arXiv cs.AI

The threat of analytic flexibility in using large language models to simulate human data

Signal: 75
Hype: 25
In three lines: An arXiv study demonstrates that analytic choices (model selection, sampling parameters, prompt format, demographic data) materially affect the fidelity of "silicon samples" (synthetic datasets generated by LLMs). Across the 252 configurations tested, correlations with human data range from r = .23 to r = .84, revealing a major risk of analytic flexibility.
Llama, Evals, AI safety, Papers

Summary generated by Claude — human-verified