Reddit r/LocalLLaMA

StepFun 3.7 Flash - Speed Benchmark on M5 Max

Signal: 65 · Hype: 15
In three lines: StepFun 3.7 Flash benchmarked on an M5 Max (128 GB) with llama.cpp. Short contexts (<16k tokens) are fast and responsive, and 32k-64k contexts remain usable. Detailed metrics: at 65k tokens, the reported token generation speed is 360.79 t/s.
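
To put the reported throughput in wall-clock terms, here is a minimal Python sketch. It assumes the 360.79 t/s figure is a sustained rate and that "65k" means 65,536 tokens. The summary does not state the exact token count or how the rate was measured, and `seconds_for_tokens` is a hypothetical helper, not part of llama.cpp.

```python
def seconds_for_tokens(n_tokens: int, tokens_per_second: float) -> float:
    """Return the time in seconds to process n_tokens at a constant rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return n_tokens / tokens_per_second


# Assumption: "65k" means 65,536 tokens and the rate holds for the whole run.
REPORTED_RATE = 360.79  # t/s, the figure quoted in the summary
elapsed = seconds_for_tokens(65_536, REPORTED_RATE)
print(f"{elapsed:.1f} s (~{elapsed / 60:.1f} min)")
```

The same helper can be applied to any other context length and rate from the original post to compare how long each configuration would take end to end.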
Open source · Benchmarks · Infrastructure

Summary generated by Claude — human-verified