How to fine-tune an LLM for open-ended problems? [P]
Signal: 35
Hype: 15
In three lines
A researcher asks how to fine-tune an LLM for open-ended math problems such as proofs. The poster considers standard SFT and RLHF inadequate for this task. They are seeking an appropriate method that uses the MathNet dataset.
Summary generated by Claude — human-verified