Reddit r/LocalLLaMA

Me train LLM on 8GB from Scratch. Me happy

Signal: 45 · Hype: 25
In three lines: A developer wrote a script that trains a small model (25M parameters) from scratch on TinyStories using only 8 GB of VRAM. Of the techniques tested (mHC, BitNet, TurboQuant, MTP), only MTP worked properly, though it was slower. The code and model are available on GitHub and Hugging Face.
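
For readers unfamiliar with MTP, here is a minimal sketch of the idea, assuming MTP here means multi-token prediction: at each position the model is trained to predict several upcoming tokens instead of just the next one. This is not the developer's code; the function name `mtp_targets`, the `n_future` parameter, and the padding value are hypothetical and chosen only for illustration (-100 mirrors the default `ignore_index` of PyTorch's cross-entropy loss).

```python
def mtp_targets(tokens, n_future=2, pad=-100):
    """Build multi-token prediction targets for a token sequence.

    For each position t, the targets are the tokens at t+1 .. t+n_future.
    Slots that would run past the end of the sequence are filled with
    `pad`, so a loss function can skip them.
    """
    targets = []
    for t in range(len(tokens)):
        row = []
        for k in range(1, n_future + 1):
            idx = t + k
            row.append(tokens[idx] if idx < len(tokens) else pad)
        targets.append(row)
    return targets


if __name__ == "__main__":
    # Hypothetical token ids for a four-token sequence.
    for position, row in enumerate(mtp_targets([5, 6, 7, 8], n_future=2)):
        print(position, row)
```

With `n_future=1` this reduces to ordinary next-token targets. Each extra future token typically needs its own output head and loss term, which adds work per training step; that may be one reason the summary describes MTP as slower, though the post's own explanation is not given here.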
Open source · Fine-tuning · Infrastructure

Summary generated by Claude — human-verified