arXiv cs.AI · 19 May 2026

EfficientTDMPC: Improved MPC Objectives for Sample-Efficient Continuous Control

In three lines: EfficientTDMPC improves the sample efficiency of model-based reinforcement learning for continuous control. The method uses an ensemble of dynamics models, averages return estimates across multiple rollout depths, and adds an uncertainty penalty to the planner objective. It achieves state-of-the-art results on HumanoidBench-Hard and DMC hard in low-data regimes.
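
To make the three ingredients concrete, here is a minimal sketch of how such a planner objective could be scored for one candidate action sequence. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say how the estimates are combined, so the uniform average over depths, the use of the ensemble standard deviation as the uncertainty measure, and all names (`multi_depth_return`, `planner_score`, `penalty`, the toy dynamics and reward) are hypothetical.

```python
import numpy as np


def multi_depth_return(state, actions, dynamics, reward_fn, value_fn, gamma):
    """Average the h-step bootstrapped returns for h = 1..H under one model.

    Assumption: depths are weighted uniformly; the paper may weight them differently.
    """
    estimates = []
    reward_sum, discount = 0.0, 1.0
    for action in actions:
        reward_sum += discount * reward_fn(state, action)
        state = dynamics(state, action)
        discount *= gamma
        # h-step estimate: discounted rewards so far plus a bootstrapped value.
        estimates.append(reward_sum + discount * value_fn(state))
    return float(np.mean(estimates))


def planner_score(state, actions, ensemble, reward_fn, value_fn,
                  gamma=0.99, penalty=1.0):
    """Ensemble mean return minus a penalty on ensemble disagreement.

    Assumption: disagreement is the standard deviation across ensemble members.
    """
    returns = np.array([
        multi_depth_return(state, actions, f, reward_fn, value_fn, gamma)
        for f in ensemble
    ])
    return float(returns.mean() - penalty * returns.std())


# Toy setup (hypothetical): scalar linear dynamics with differing gains.
def make_dynamics(gain):
    return lambda s, a: gain * s + a


def reward_fn(s, a):
    return -float(np.sum(s ** 2))  # reward for staying near the origin


def value_fn(s):
    return 0.0  # placeholder terminal value


state0 = np.array([1.0])
plan = [np.array([0.0])] * 3
ensemble = [make_dynamics(g) for g in (0.9, 1.0, 1.1)]

score = planner_score(state0, plan, ensemble, reward_fn, value_fn)
```

In this sketch, a plan on which the ensemble members disagree scores lower than its mean return alone would suggest, which is the intended effect of an uncertainty penalty: the planner is steered toward action sequences whose predicted outcome the models agree on.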


Summary generated by Claude — human-verified
