arXiv cs.CL

Fast-dDrive: Efficient Block-Diffusion VLM for Autonomous Driving

Signal: 78 · Hype: 25
In three lines: Fast-dDrive is a block-diffusion VLA (Vision-Language-Action) model for autonomous driving. It combines bidirectional refinement within semantic units with strict causal ordering, handles structured JSON outputs, and achieves a 12× throughput speedup with SGLang. On nuScenes, it reduces L2 error to 0.32 m (a 22% improvement), and it reports state-of-the-art results on WOD-E2E.
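To make "bidirectional within semantic units, causal ordering otherwise" concrete, here is a minimal sketch of a block-causal attention mask, the pattern commonly associated with block-diffusion decoding. It is an illustration under stated assumptions, not the paper's implementation: the summary does not say how Fast-dDrive builds its mask, and the function name `block_causal_mask` and the `block_ids` layout are hypothetical.

```python
def block_causal_mask(block_ids):
    """Return mask[i][j] == True when position i may attend to position j.

    block_ids is a non-decreasing list with one block index per token.
    It is a hypothetical stand-in for the "semantic units" in the summary.
    Tokens see every token in their own block (bidirectional) and every
    token in earlier blocks (causal), but nothing in later blocks.
    """
    n = len(block_ids)
    return [[block_ids[j] <= block_ids[i] for j in range(n)] for i in range(n)]


# Hypothetical example: 5 tokens grouped into blocks of size 2, 2, and 1.
block_ids = [0, 0, 1, 1, 2]
mask = block_causal_mask(block_ids)

# Print one row per query position: "x" = may attend, "." = masked out.
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In this sketch, token 0 can attend to token 1 even though token 1 comes later, because both sit in block 0; neither can attend to tokens in blocks 1 or 2.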
Vision · Code generation · Reasoning · Benchmarks · Robotics

Summary generated by Claude — human-verified