LC-ERD: Mining Latent Logic for Self-Evolving Reasoning via Consistency-Regulated Reward Decomposition
Signal: 72 · Hype: 28
In three lines
LC-ERD is a self-alignment framework for LLMs that mines latent logical structures via consistency-regulated reward decomposition. It addresses three challenges: label noise from mimetic bias, coarse-grained supervision, and distributional collapse. It uses a Variational Logic Potential and multi-agent value decomposition based on the Individual-Global-Max (IGM) principle.
Summary generated by Claude — human-verified
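
The summary names the IGM principle but does not say how LC-ERD applies it. As a rough illustration of the general principle only, and not of the paper's method, the sketch below uses an additive (VDN-style) decomposition, for which IGM holds: the joint greedy action matches the tuple of per-agent greedy actions. The utility table and all function names are hypothetical.

```python
from itertools import product

# Hypothetical per-agent utilities: per_agent_q[i][a] is agent i's value
# for action a. These numbers are made up for illustration.
per_agent_q = [
    [0.2, 0.9, 0.1],  # agent 0
    [0.5, 0.3, 0.8],  # agent 1
]


def joint_q(actions):
    """Additive (VDN-style) joint value: the sum of per-agent utilities."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))


def greedy_individual():
    """Each agent independently picks the action maximizing its own utility."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def greedy_joint():
    """Brute-force argmax over the full joint action space."""
    spaces = [range(len(q)) for q in per_agent_q]
    return max(product(*spaces), key=joint_q)


individual = greedy_individual()
joint = greedy_joint()

# IGM: the joint argmax coincides with the per-agent argmaxes.
print(individual, joint, individual == joint)
```

The brute-force search grows exponentially with the number of agents, whereas the per-agent search does not; that gap is the practical reason a decomposition satisfying IGM is attractive.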