OpenAI Blog

Evaluating large language models trained on code

Signal: 65
Hype: 25
In three lines
OpenAI publishes an evaluation methodology for large language models trained on code. The study proposes benchmarks and criteria to measure the quality and performance of code generation models.
OpenAI · Code generation · Benchmarks · Evals

Summary generated by Claude — human-verified