Hacker News (AI)

768GB Intel Optane DIMMs to run 1T-parameter LLM with single GPU at 4tps

Signal: 35
Hype: 45
In three lines: 768GB of Intel Optane DIMMs enable running a 1-trillion-parameter LLM on a single GPU at 4 tokens/second. The hardware configuration targets inference of very large models without distributed infrastructure.
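As a rough sanity check on the headline numbers, the capacity figure alone bounds how many bits each parameter can occupy on average. The sketch below is only back-of-the-envelope arithmetic. The function name `max_bits_per_param` is hypothetical, and it assumes all weights reside in the Optane DIMMs with no other overhead, which the summary does not confirm.

```python
def max_bits_per_param(capacity_bytes: float, n_params: float) -> float:
    """Upper bound on average bits per parameter if every weight
    must fit inside capacity_bytes (assumes no other overhead)."""
    return capacity_bytes * 8 / n_params


N_PARAMS = 1e12  # 1 trillion parameters, from the headline

# 768 GB read as decimal gigabytes (assumption).
print(f"{max_bits_per_param(768e9, N_PARAMS):.2f} bits/param (decimal GB)")

# 768 GB read as binary gibibytes (assumption).
print(f"{max_bits_per_param(768 * 2**30, N_PARAMS):.2f} bits/param (GiB)")
```

Under either reading the bound is below 8 bits per parameter, so the weights would need to be stored below 8-bit precision on average to fit. The summary does not state which format is used.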
Infrastructure · Benchmarks

Summary generated by Claude — human-verified