arXiv cs.AI

Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation

Signal: 78
Hype: 25
In three lines
OptimusVLA is a hierarchical Vision-Language-Action model that improves robotic manipulation with two memories.
Global Prior Memory replaces Gaussian noise with trajectory priors, and Local Consistency Memory enforces temporal coherence.
Reported results: 98.6% on LIBERO, +13.5% over pi_0 on CALVIN, and a 2.9x inference speedup.
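
As a rough illustration of the two memory ideas named in the summary, here is a minimal Python sketch. It is not OptimusVLA's implementation. The nearest-neighbor lookup, the blending rule, the `noise_scale` and `weight` parameters, and all function names and toy data are assumptions made purely for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_prior(query, memory):
    """Return the stored trajectory whose key is most similar to the query.

    Assumption: a simple nearest-neighbor lookup stands in for whatever
    retrieval the Global Prior Memory actually uses.
    """
    _, best_traj = max(memory, key=lambda item: cosine(query, item[0]))
    return best_traj


def init_from_prior(prior, noise_scale, rng):
    """Start the action sampler near a prior trajectory, not at pure Gaussian noise."""
    return [a + noise_scale * rng.gauss(0.0, 1.0) for a in prior]


def blend_with_previous(new_chunk, prev_chunk, weight):
    """Pull the new action chunk toward the previous one.

    Assumption: linear blending is only a stand-in for "temporal coherence".
    """
    return [(1 - weight) * n + weight * p for n, p in zip(new_chunk, prev_chunk)]


rng = random.Random(0)

# Hypothetical memory bank: (task embedding, 1-D action trajectory) pairs.
memory = [
    ([1.0, 0.0], [0.0, 0.1, 0.2, 0.3]),
    ([0.0, 1.0], [0.5, 0.4, 0.3, 0.2]),
]

prior = retrieve_prior([0.9, 0.1], memory)
start = init_from_prior(prior, noise_scale=0.01, rng=rng)
smoothed = blend_with_previous(start, prev_chunk=[0.0, 0.1, 0.2, 0.3], weight=0.5)
```

The intuition behind the reported speedup is that a sampler started close to a plausible trajectory should need fewer refinement steps than one started from noise. The sketch only shows the initialization and smoothing steps, not the refinement itself.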
Vision · Robotics · AI Agents · Benchmarks

Summary generated by Claude — human-verified