Real-time multilingual ASR using rolling buffers and monolingual models [P]
Signal
78
Hype
25
In three linesReal-time multilingual ASR system routing audio between specialized monolingual models (~100M parameters each) instead of one large model. Detects language switches via SpeechBrain and re-transcribes with correct model. Achieves 13% WER on inter-utterance code-switching, outperforming cloud APIs. Open-source repo released.Read source
Your take?
Summary generated by Claude — human-verified