Reddit r/MachineLearning · 23 May 2026

I fine-tuned an LLM to be C-3PO to test which training data format works best for persona injection [P]

In three lines: A LoRA fine-tuning experiment compares three training data formats for C-3PO persona injection (chat demos, first-person statements, and synthetic Wikipedia docs). First-person statements win on generalization. Synthetic docs produce paradoxical behavior: the model knows C-3PO is anxious but expresses that anxiety only 37% of the time.

Fine-tuning Prompt engineering Papers

Summary generated by Claude — human-verified
