arXiv cs.CL

Benchmarking Open-Source Safety Guard Models: A Comprehensive Evaluation

Signal: 82 · Hype: 18
In three lines: Comprehensive evaluation of 14 open-source safety guard models on 79,331 samples across 8 NIST AI Risk Framework categories. Qwen Guard (4B) achieves the highest recall (83.97%), outperforming Llama Guard (12B) and GPT-OSS Safeguard (20B). Model size does not correlate with safety detection performance.
Benchmarks · AI safety · Open source · Qwen · Llama

Summary generated by Claude — human-verified