arXiv cs.CL

ChildEval: When large language models meet children's personalities

Signal: 72
Hype: 25
In three lines: ChildEval is a benchmark of 29K synthesized child personality profiles (ages 3-6) for evaluating how well LLMs infer and follow child-centered preferences in long-context conversations. The dataset covers 5 top-level and 14 sub-level categories of daily life. Results show that fine-tuning on ChildEval improves child-centered performance.
Benchmarks, Fine-tuning, Evals, Papers

Summary generated by Claude — human-verified