arXiv cs.LG·2 June 2026

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

Signal

Hype

In three linesRAFT is a two-stage domain fine-tuning method that mitigates catastrophic forgetting. It refines data via self-conditioned rewriting and answer fusion, then applies on-policy distillation where the original model provides soft targets on student-generated trajectories. Across five domains, RAFT improves domain accuracy by 23.2% over standard SFT and recovers 18.2% of degradation on MS-Bench.

Read source

Your take?

Fine-tuning Reinforcement learning Papers

Summary generated by Claude — human-verified

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

Other angles on this story