arXiv cs.AI

Efficient Emotion-Aware Iconic Gesture Prediction for Robot Co-Speech

Signal: 72
Hype: 25
In three lines: A lightweight transformer predicts iconic co-speech gestures for robots from text and emotion alone, without audio at inference time. The model outperforms GPT-4o on semantic gesture placement classification and intensity regression on the BEAT2 dataset, while remaining computationally compact for real-time embodied agent deployment.
Robotics, Reasoning, Benchmarks

Summary generated by Claude — human-verified