Reddit r/LocalLLaMA

Xiaomi just claimed 1,000+ tps on a 1T model using a standard 8-GPU server

Signal: 35
Hype: 72
In three lines: Xiaomi announces MiMo-V2.5-Pro UltraSpeed, claiming 1,000+ tokens/sec on a 1-trillion-parameter MoE model using a standard 8-GPU server, without custom hardware.
Open source · Benchmarks · Infrastructure

Summary generated by Claude — human-verified