Back to feed
Reddit r/LocalLLaMA·

I fine-tuned Cohere Transcribe to support diarization and timestamps

Signal
72
Hype
25
In three linesDeveloper fine-tuned Cohere Transcribe to add diarization (speaker identification) and timestamps. Model outputs parsable format with average temporal precision of ±0.097s. Supports up to 4 speakers per 30s, extensible to 32 with diarize_long.py script. Available free on Hugging Face.
Read source
Your take?
Open sourceFine-tuningVoice

Summary generated by Claude — human-verified