arXiv cs.CL

Raon-Speech Technical Report

Signal: 82 · Hype: 25
In three lines: Raon-Speech is a 9B-parameter multilingual (English/Korean) speech language model that understands and generates speech while preserving its text capabilities. Trained on 1.38M hours of curated data, it outperforms 8 comparable audio models (including Qwen2.5-Omni and Fun-Audio-Chat) across 42 benchmarks. Raon-SpeechChat extends it with real-time full-duplex conversation, trained on 119K hours of dialogue.
Voice · Benchmarks · Open source · Multi-agent

Summary generated by Claude — human-verified