Do No Harm? Hallucination and Actor-Level Abuse in Web-Deployed Medical Large Language Models
Signal: 78
Hype: 25
In three lines: Study of 6,233 MedGPTs and 10 open-source models deployed on the web. 25-30% show low factual accuracy, 33.6-54.3% violate operational thresholds, and 57% of Action-enabled models lack privacy disclosures. The authors introduce MedGPT-HEval for hallucination detection and release HAA-MedGPT, a structured dataset.
Summary generated by Claude — human-verified