MindZero: Learning Online Mental Reasoning With Zero Annotations
MindZero is a self-supervised reinforcement learning framework training multimodal LLMs to infer human mental states without annotations. The model is rewarded for generating mental state hypotheses that maximize the likelihood of observed actions. After training, inference becomes fast single-pass and outperforms model-based methods in both accuracy and efficiency.