arXiv cs.AI

UniScale: Adaptive Unified Inference Scaling via Online Joint Optimization of Model Routing and Test-Time Scaling

Signal 75, Hype 25
In three lines: UniScale unifies model routing and test-time scaling (TTS) in a single optimization space to balance LLM inference quality and computational cost. The framework uses LinUCB and contextual multi-armed bandit theory to learn adaptive inference policies online, with cost modeling and efficiency-aware learning.
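To make the idea concrete, here is a minimal sketch of how a LinUCB bandit could choose jointly over a model and a test-time scaling budget. It is not the paper's implementation. It assumes a disjoint LinUCB (one ridge regression per arm), a reward of the form quality minus a weighted cost, and a cost that grows linearly with the sample count. The model names, relative costs, sample counts, `cost_weight`, and the `JointRouter` class are all hypothetical.

```python
import math
from itertools import product

# Hypothetical joint action space: each (model, TTS sample count) pair is one arm.
MODELS = {"small": 1.0, "large": 5.0}  # assumed relative cost per sample
SAMPLE_COUNTS = [1, 4]                 # assumed test-time scaling budgets
ARMS = list(product(MODELS, SAMPLE_COUNTS))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(m, v):
    return [dot(row, v) for row in m]


class LinUCBArm:
    """Disjoint LinUCB state for one arm: A^-1 (starting at identity) and b."""

    def __init__(self, dim):
        self.a_inv = [[float(i == j) for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def ucb(self, x, alpha):
        theta = matvec(self.a_inv, self.b)  # ridge estimate of the reward weights
        width = math.sqrt(max(dot(x, matvec(self.a_inv, x)), 0.0))
        return dot(theta, x) + alpha * width

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of A^-1 after A <- A + x x^T.
        ax = matvec(self.a_inv, x)
        denom = 1.0 + dot(x, ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(len(x)):
            self.b[i] += reward * x[i]


class JointRouter:
    """Picks a (model, sample count) pair per query and learns online."""

    def __init__(self, dim, alpha=0.5, cost_weight=0.1):
        self.alpha = alpha              # exploration strength
        self.cost_weight = cost_weight  # assumed quality/cost trade-off weight
        self.arms = {arm: LinUCBArm(dim) for arm in ARMS}

    def select(self, x):
        return max(ARMS, key=lambda arm: self.arms[arm].ucb(x, self.alpha))

    def update(self, arm, x, quality):
        model, n_samples = arm
        cost = MODELS[model] * n_samples  # assumed linear cost model
        reward = quality - self.cost_weight * cost
        self.arms[arm].update(x, reward)
        return reward
```

In use, each query is mapped to a feature vector `x` (for example a bias term plus a difficulty estimate), `select(x)` returns the arm to run, and `update(arm, x, quality)` feeds back the observed quality once the answer is scored. Treating each pair as its own arm keeps the sketch short, but the number of arms grows with the product of models and budgets.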
Reasoning, Multi-agent

Summary generated by Claude — human-verified