arXiv cs.CL

Transcribing Children's Speech: ASR Performance and Obtaining Reliable Orthographic Transcriptions

Signal
72
Hype
15
In three lines
Comparative study of 9 ASR models (Whisper, Parakeet, Wav2Vec2) on Dutch child speech. Fine-tuned Whisper-medium achieves a word error rate (WER) of 5.54% on JASMIN and 70.37% on DART. An utterance-level selection method identifies 42% (JASMIN) and 18.1% (DART) of utterances as correctly pronounced with ≥98.3% precision, reducing the need for manual verification.
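
For readers less familiar with the two metrics quoted above, here is a minimal sketch of how they are conventionally computed. It is not the paper's code: WER is shown as the standard word-level edit distance divided by reference length, and `selection_precision_and_coverage` is a hypothetical helper that only illustrates what "X% of utterances selected at Y% precision" means. The Dutch example sentences and the boolean labels are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def selection_precision_and_coverage(selected, correct):
    """Hypothetical helper: `selected` and `correct` are parallel lists of
    booleans, one entry per utterance. Precision is the share of selected
    utterances that are truly correct; coverage is the share selected."""
    if len(selected) != len(correct):
        raise ValueError("inputs must have the same length")
    n_selected = sum(selected)
    if n_selected == 0:
        raise ValueError("no utterances were selected")
    true_positives = sum(1 for s, c in zip(selected, correct) if s and c)
    return true_positives / n_selected, n_selected / len(selected)


if __name__ == "__main__":
    # Made-up example: one substitution and one deletion against 6 reference words.
    print(word_error_rate("de kat zit op de mat", "de kat zat op mat"))
    # Made-up labels: 3 of 4 utterances selected, 2 of those 3 truly correct.
    print(selection_precision_and_coverage(
        [True, True, False, True], [True, True, True, False]))
```

Because WER counts insertions as errors, it can exceed 100%; the summary's figures are best read as error rates relative to the reference transcript, not as the share of utterances that were wrong.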
Benchmarks, Voice, Evals

Summary generated by Claude — human-verified