arXiv cs.AI·26 May 2026

LC-ERD: Mining Latent Logic for Self-Evolving Reasoning via Consistency-Regulated Reward Decomposition

In three lines

LC-ERD is a self-alignment framework for LLMs that mines latent logical structures via consistency-regulated reward decomposition. It addresses three challenges: label noise from mimetic bias, coarse-grained supervision, and distributional collapse. It uses a Variational Logic Potential and multi-agent value decomposition based on the Individual-Global-Max (IGM) principle.

Reasoning · Reinforcement learning · Alignment · Papers

Summary generated by Claude — human-verified
