arXiv cs.LG

NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models

Signal: 75
Hype: 25
In three lines: NumLeak measures memorization of public benchmarks in frontier LLMs. Models recall Fama-French data (r=0.97-0.99), US unemployment, and NOAA temperature with high fidelity. On recent unseen data, parse rate drops to 21-57% but r stays ~0.99 for answered months. A one-line system-prompt defense blocks 99.8% of attacks.
Benchmarks, Evals, AI safety, Alignment

Summary generated by Claude — human-verified