arXiv cs.AI · 19 May 2026

The threat of analytic flexibility in using large language models to simulate human data

In three lines

An arXiv study demonstrates that analytic choices (model selection, sampling parameters, prompt format, and the demographic data supplied) materially affect the fidelity of "silicon samples", synthetic datasets generated by LLMs. Across the 252 configurations tested, correlations with human data range from r = .23 to r = .84. That spread points to a major risk from analytic flexibility.

Llama · Evals · AI safety · Papers

Summary generated by Claude — human-verified
