arXiv cs.CL·28 May 2026

DecomposeRL: Learning to Ask Useful, Informative, and Diverse Questions for Semi-Supervised, Traceable Claim Verification


In three lines: DecomposeRL combines accurate claim verification with inspectable traces using RL (GRPO). A 7B model trained on 5K curated claims achieves 86.3% in-domain and 69.8% out-of-domain accuracy, matching 32B baselines and GPT-4.1-mini. Works in semi-supervised settings with only 10% labeled data.


Tags: Reasoning · Reinforcement learning · Benchmarks · Evals

Summary generated by Claude — human-verified

