TeachObs: A Human-Validated Benchmark for Multimodal Teaching Observation and Model Evaluation
Signal: 78
Hype: 15
In three lines
TeachObs is a human-validated multimodal benchmark for classroom video analysis. It contains 30 public lessons from 8 countries, split into 5,158 15-second scenes and annotated by 7 researchers with 39 observation codes (20 visual, 19 non-visual). An evaluation of 5 vision-capable LLMs across 3 tasks finds that no single model consistently outperforms the others.
Summary generated by Claude — human-verified