
How Off-Policy Can GRPO Be? Mu-GRPO for Efficient LLM Reinforcement Learning
