Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation
Signal: 78
Hype: 25
In three lines
OptimusVLA, a hierarchical Vision-Language-Action model, improves robotic manipulation with two memories: Global Prior Memory, which replaces Gaussian noise with trajectory priors, and Local Consistency Memory, which enforces temporal coherence. Results: 98.6% on LIBERO, +13.5% vs pi_0 on CALVIN, and a 2.9x inference speedup.
Summary generated by Claude — human-verified