Anyone using Flash Attention 2 (ai-bond) on their V100s? How is the performance?
In three lines: A user benchmarks Flash Attention 2 (ai-bond) on a V100. Results show a 7-24x speedup in the backward pass and a memory reduction of up to 91.9% (323.4 MB saved). Thinking time before answering minimized. Numerical validation passes on causal and non-causal configurations.
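For readers unsure what "numerical validation on causal and non-causal configurations" involves, here is a minimal sketch in pure Python. It is not the ai-bond code or its test suite. The function names, sizes, block size, and tolerance are all assumptions made for illustration. It compares a plain softmax attention against a tiled version that uses a running (online) softmax, which is the general idea behind Flash Attention, and checks that the two agree with and without a causal mask.

```python
import math
import random

def reference_attention(q, k, v, causal):
    """Plain softmax(QK^T / sqrt(d)) V, materializing each full score row."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        last = i + 1 if causal else len(k)  # causal: attend to positions <= i
        scores = [scale * sum(a * b for a, b in zip(qi, k[j])) for j in range(last)]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights)) / total
                    for c in range(len(v[0]))])
    return out

def tiled_attention(q, k, v, causal, block=4):
    """Same result, computed one key block at a time with a running softmax."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        last = i + 1 if causal else len(k)
        m = -math.inf             # running maximum of the scores seen so far
        denom = 0.0               # running softmax denominator
        acc = [0.0] * len(v[0])   # running unnormalized output row
        for start in range(0, last, block):
            idx = range(start, min(start + block, last))
            scores = [scale * sum(a * b for a, b in zip(qi, k[j])) for j in idx]
            new_m = max(m, max(scores))
            rescale = math.exp(m - new_m)  # exp(-inf) is 0.0 on the first block
            denom *= rescale
            acc = [a * rescale for a in acc]
            for j, s in zip(idx, scores):
                w = math.exp(s - new_m)
                denom += w
                acc = [a + w * x for a, x in zip(acc, v[j])]
            m = new_m
        out.append([a / denom for a in acc])
    return out

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def validate(seq_len=10, dim=8, tol=1e-9, seed=0):
    """Return {causal_flag: passed} for random inputs (hypothetical sizes)."""
    rng = random.Random(seed)
    def make():
        return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(seq_len)]
    q, k, v = make(), make(), make()
    return {causal: max_abs_diff(reference_attention(q, k, v, causal),
                                 tiled_attention(q, k, v, causal)) < tol
            for causal in (False, True)}

if __name__ == "__main__":
    print(validate())
```

A real GPU check would compare half-precision kernel outputs and gradients against a framework reference with a much looser tolerance. The tight tolerance here only works because both paths run in double precision.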
Summary generated by Claude — human-verified