TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models
Signal
75
Hype
15
In three lines
TrustLDM is a trustworthiness benchmark for Language Diffusion Models (LDMs) covering safety, privacy, and fairness. Results show that LDM alignment degrades when malicious post contexts are attached to masked responses, regardless of context length. An automatic evaluation framework (TrustLDM-Auto) systematically identifies vulnerable configurations across all tested models.
Summary generated by Claude — human-verified