arXiv cs.AI

Advancing Creative Physical Intelligence in Large Multimodal Models

Signal: 75
Hype: 25
In three lines
MM-CreativityBench, a new benchmark, evaluates how well large multimodal models (LMMs) solve creative problems by identifying non-obvious uses for objects in physically constrained environments. Current LMMs fail because of hallucinations and insufficient grounded exploration. Affordance-grounded alignment via Direct Preference Optimization reduces these errors and improves entity selection.
Benchmarks, Vision, Reasoning, Alignment

Summary generated by Claude — human-verified