arXiv cs.CL

FormalASR: End-to-End Spoken Chinese to Formal Text

Signal: 75
Hype: 15
In three lines: FormalASR introduces two compact models (0.6B and 1.7B parameters) that transcribe spoken Chinese directly into formal written text, with no separate ASR+LLM pipeline. Both are produced by supervised fine-tuning of Qwen3-ASR on WenetSpeech-Formal and Speechio-Formal. They achieve a 37.4% relative CER reduction over verbatim baselines and improve ROUGE-L and BERTScore.
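
The 37.4% figure is a relative reduction in character error rate (CER), not an absolute drop in percentage points. The sketch below shows how such a number is typically derived, assuming the standard definition of CER as character-level edit distance divided by reference length; the summary does not say how the paper computes it. The sentences and the baseline/system rates are invented for illustration and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between reference and hypothesis."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference character
                            curr[j - 1] + 1,      # extra hypothesis character
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalised by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, system: float) -> float:
    """Fraction of the baseline error removed by the system."""
    return (baseline - system) / baseline


# Hypothetical example: the reference is the formal written form.
formal_ref = "我明天去北京"
verbatim_hyp = "嗯我明天那个去北京"   # verbatim output keeps spoken fillers
formal_hyp = "我明天去南京"           # formal output with one wrong character

baseline_cer = cer(formal_ref, verbatim_hyp)
system_cer = cer(formal_ref, formal_hyp)
print(baseline_cer, system_cer, relative_reduction(baseline_cer, system_cer))

# Hypothetical rates chosen so the relative reduction works out to 37.4%.
print(relative_reduction(0.10, 0.0626))
```

Scoring a verbatim transcript against a formal reference penalises every retained filler word, which is why a model trained to emit formal text directly can cut CER sharply against that reference.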
Qwen · Code generation · Fine-tuning · Benchmarks

Summary generated by Claude — human-verified