arXiv cs.AI · 19 May 2026

On the Adversarial Robustness of Large Vision-Language Models under Visual Token Compression

In three lines: A study of adversarial robustness in vision-language models that compress visual tokens. The authors propose CAGE, an attack that exploits the mismatch between perturbation optimization (on full tokens) and inference (after compression). CAGE combines expected feature disruption and rank distortion alignment to expose hidden vulnerabilities in compressed LVLMs.
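
The optimization/inference mismatch can be illustrated with a toy sketch. This is not the authors' CAGE implementation: the top-k pruning rule, the squared-difference disruption measure, the fixed importance scores, and all names (`top_k_keep`, `disruption`, `adv_full`, `adv_aware`) are assumptions made for illustration only.

```python
# Toy illustration (hypothetical, not CAGE itself): a perturbation that looks
# strong on the full token set can vanish once low-scoring tokens are pruned.

def top_k_keep(scores, k):
    """Return the indices of the k highest-scoring tokens, in position order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def disruption(clean, adv, kept):
    """Sum of squared feature changes over the tokens listed in `kept`."""
    return sum((a - c) ** 2 for i in kept for a, c in zip(adv[i], clean[i]))

# Four visual tokens with 2-D features and assumed importance scores.
clean = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.25, 0.75]]
scores = [0.9, 0.1, 0.7, 0.2]

# Perturbation that spends its budget on tokens 1 and 3 (low scores).
adv_full = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [1.25, -0.25]]
# Perturbation with the same total change, placed on tokens 0 and 2.
adv_aware = [[0.0, 1.0], [0.0, 1.0], [-0.5, 1.5], [0.25, 0.75]]

kept = top_k_keep(scores, k=2)   # tokens that survive compression
all_tokens = range(len(clean))

full_view = disruption(clean, adv_full, all_tokens)    # what the optimizer sees
after_prune = disruption(clean, adv_full, kept)        # what inference sees
aware_after_prune = disruption(clean, adv_aware, kept)

print(kept, full_view, after_prune, aware_after_prune)
```

In this sketch the scores are held fixed, so it only shows why a compression-unaware perturbation may understate vulnerability. An attack such as the one summarized above would presumably also account for how the perturbation changes which tokens are kept.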

Vision · AI safety · Benchmarks

Summary generated by Claude — human-verified
