nvidia/Qwen3.6-35B-A3B-NVFP4 · Hugging Face
Signal: 75
Hype: 15
In three lines:
NVIDIA quantized Alibaba's Qwen3.6-35B-A3B model to NVFP4 (a 4-bit floating-point format) using Model Optimizer.
Cutting weights from 16 to 4 bits per parameter shrinks GPU memory use and disk size by ~3.06x.
Benchmark results show minimal accuracy loss: MMLU Pro 85.6→85.0, GPQA Diamond 84.9→84.8.
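The reported ~3.06x is lower than the nominal 16/4 = 4x, and a back-of-the-envelope calculation suggests why. The sketch below assumes NVFP4's usual layout of one 8-bit scale per block of 16 four-bit values, ignores the small per-tensor scale, and assumes any unquantized weights stay at 16 bits. The function names are illustrative, and the quantized fraction it derives is an inference from this simplified model, not a figure from the source.

```python
# Rough size arithmetic for NVFP4 quantization (simplified model, see assumptions above).

def effective_bits(value_bits=4, scale_bits=8, block_size=16):
    """Bits per weight once the per-block scale is amortized over the block."""
    return value_bits + scale_bits / block_size

def compression_ratio(quantized_fraction, baseline_bits=16, quant_bits=4.5):
    """Overall size reduction when only part of the weights are quantized."""
    avg_bits = quantized_fraction * quant_bits + (1 - quantized_fraction) * baseline_bits
    return baseline_bits / avg_bits

def quantized_fraction_for(ratio, baseline_bits=16, quant_bits=4.5):
    """Invert compression_ratio: which quantized fraction yields a given ratio?"""
    avg_bits = baseline_bits / ratio
    return (baseline_bits - avg_bits) / (baseline_bits - quant_bits)

if __name__ == "__main__":
    bits = effective_bits()
    print(f"effective bits per quantized weight: {bits}")
    print(f"best case (everything quantized): {compression_ratio(1.0, quant_bits=bits):.2f}x")
    frac = quantized_fraction_for(3.06, quant_bits=bits)
    print(f"quantized fraction implied by 3.06x: {frac:.1%}")
```

Under these assumptions the ceiling is about 3.56x even with every weight quantized, so a ~3.06x result is consistent with a small share of the weights remaining in higher precision.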
Summary generated by Claude — human-verified