Reddit r/LocalLLaMA

Get in here: Community model build thread

Signal: 35 · Hype: 45
In three lines: A Reddit thread proposes building a community model with distributed compute using a Mixture-of-Experts (MoE) approach. Under the 'Branch-Train-Stitch' strategy, a dense prototype model is distributed to participants, who each train a copy independently on their own hardware; the trained copies are then merged into a single MoE. Key decisions include the prototype size (2B or 7B parameters), which depends on participants' available VRAM.
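The summary does not say how the thread intends to do the merge, so the following is only a minimal sketch of one plausible reading of branch, train, stitch: copy a dense feed-forward block, let each copy drift independently, then keep the copies as experts behind a newly initialized top-k router. Every name here (`make_ffn`, `branch`, `local_update`, `stitch`, `moe_forward`) is hypothetical, the sizes are toy values, and `local_update` is a random perturbation standing in for real training on a participant's machine.

```python
import copy
import math
import random


def make_ffn(d_model, d_hidden, rng):
    """Toy dense feed-forward block; weights are nested lists (rows x cols)."""
    w_in = [[rng.gauss(0, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]
    w_out = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
    return {"w_in": w_in, "w_out": w_out}


def matvec(x, w):
    """Multiply a length-n vector by an n x m matrix, giving a length-m vector."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def ffn_forward(ffn, x):
    hidden = [max(0.0, v) for v in matvec(x, ffn["w_in"])]  # ReLU
    return matvec(hidden, ffn["w_out"])


def branch(prototype, n_participants):
    """Branch: hand every participant an identical copy of the prototype."""
    return [copy.deepcopy(prototype) for _ in range(n_participants)]


def local_update(ffn, rng, scale=0.01):
    """Stand-in for local training (assumption): perturb the weights in place."""
    for name in ("w_in", "w_out"):
        for row in ffn[name]:
            for j in range(len(row)):
                row[j] += rng.gauss(0, scale)


def stitch(experts, d_model, rng):
    """Stitch: keep each trained copy as an expert and add a fresh router."""
    router = [[rng.gauss(0, 0.1) for _ in experts] for _ in range(d_model)]
    return {"experts": experts, "router": router}


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(moe, x, top_k=2):
    """Route x to the top_k experts and mix their outputs by renormalized gates."""
    gates = softmax(matvec(x, moe["router"]))
    top = sorted(range(len(gates)), key=lambda i: gates[i], reverse=True)[:top_k]
    norm = sum(gates[i] for i in top)
    out = [0.0] * len(x)
    for i in top:
        y = ffn_forward(moe["experts"][i], x)
        out = [o + (gates[i] / norm) * v for o, v in zip(out, y)]
    return out, top


if __name__ == "__main__":
    rng = random.Random(0)
    prototype = make_ffn(d_model=4, d_hidden=8, rng=rng)
    copies = branch(prototype, n_participants=3)
    for c in copies:
        local_update(c, rng)  # each participant would train here
    moe = stitch(copies, d_model=4, rng=rng)
    output, chosen = moe_forward(moe, [0.5, -1.0, 0.25, 2.0])
    print("experts used:", chosen)
    print("output:", output)
```

One property of this design is that the stitched layer starts out no worse than the prototype: if the copies are still identical, the renormalized gate weights sum to one and the MoE reproduces the dense output exactly. In a real run the router would still need some training after stitching, since it starts from random weights.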
Open source · Fine-tuning

Summary generated by Claude — human-verified