arXiv cs.CL

OpenCompass: A Universal Evaluation Platform for Large Language Models

Signal: 75
Hype: 25
In three lines: OpenCompass is an open-source LLM evaluation platform with a modular architecture built from five core components: a configuration system, task partitioning, execution and scheduling, a task execution unit, and result visualization. It supports rule-based, LLM-as-a-Judge, and cascaded evaluators across multi-domain benchmarks (knowledge, reasoning, code, science).
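
To make the cascaded evaluator idea concrete, here is a minimal sketch. It assumes "cascaded" means a cheap rule-based check runs first and a judge is consulted only when that check fails. The names `rule_based`, `cascaded`, and `stub_judge` are hypothetical and are not OpenCompass's API; the judge here is a plain function standing in for an LLM call.

```python
from typing import Callable, Tuple


def rule_based(prediction: str, reference: str) -> bool:
    """Rule-based check: exact match after trimming and lowercasing."""
    return prediction.strip().lower() == reference.strip().lower()


def cascaded(
    prediction: str,
    reference: str,
    judge: Callable[[str, str], bool],
) -> Tuple[bool, str]:
    """Return (is_correct, stage), where stage names the evaluator that decided.

    Assumption: the judge is only called when the rule-based check fails.
    """
    if rule_based(prediction, reference):
        return True, "rule"
    return judge(prediction, reference), "judge"


def stub_judge(prediction: str, reference: str) -> bool:
    """Hypothetical stand-in for an LLM judge: accepts a prediction
    that contains the reference anywhere in its text."""
    return reference.strip().lower() in prediction.lower()
```

With this sketch, `cascaded("Paris", " paris ", stub_judge)` is settled by the rule stage, while `cascaded("The answer is Paris.", "Paris", stub_judge)` falls through to the judge. The appeal of this ordering is that the expensive judge only sees the answers the cheap rule could not confirm.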
Benchmarks · Evals · Open source · Tools

Summary generated by Claude — human-verified