arXiv cs.LG

Towards Verifiable Transformers: Solver-Checkable Circuit Explanations

Signal: 72
Hype: 18
In three lines: The Verifiable Transformers framework converts task-localized Transformer circuits into solver-checkable formal claims. It extracts circuits and checks their functional equivalence, edge necessity, invariance, and robustness by encoding each claim for an SMT solver. It demonstrates direct verification on symbolic tasks and, at GPT-2 scale, surrogate-mediated verification using SMT-representable operators (Signed L1 BandNorm, sparsemax, LeakyReLU).
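One way to see why sparsemax and LeakyReLU suit solver encoding: both are piecewise-linear, so they can be evaluated with comparisons and linear arithmetic alone, unlike softmax or GELU, which need exponentials. The sketch below illustrates this in exact rational arithmetic. It is not the paper's implementation; the function names and the slope `alpha = 1/100` are assumptions made for illustration, and Signed L1 BandNorm is left out because the summary does not define it.

```python
from fractions import Fraction


def leaky_relu(x, alpha=Fraction(1, 100)):
    """Piecewise-linear activation: x if x >= 0, else alpha * x.

    alpha = 1/100 is an assumed slope, not a value taken from the paper.
    """
    return x if x >= 0 else alpha * x


def sparsemax(z):
    """Euclidean projection of z onto the probability simplex.

    Uses only sorting, comparisons, and linear arithmetic, which is what
    makes the operator expressible in linear real arithmetic.
    """
    z = [Fraction(v) for v in z]
    z_sorted = sorted(z, reverse=True)

    # Find the largest k such that 1 + k * z_(k) > sum of the top-k entries.
    cumsum = Fraction(0)
    k_best, sum_best = 0, Fraction(0)
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        if 1 + k * v > cumsum:
            k_best, sum_best = k, cumsum

    # The threshold tau shifts the support so the outputs sum to exactly 1.
    tau = (sum_best - 1) / k_best
    return [max(v - tau, Fraction(0)) for v in z]


probs = sparsemax([1, Fraction(1, 2), 0])
print(probs)       # exact rationals, with a true zero on the last entry
print(sum(probs))  # exactly 1, no floating-point tolerance needed
```

Because every intermediate value is an exact rational, a claim such as "the third output is exactly zero" can be stated without a tolerance, which is the kind of crisp statement an SMT query needs.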
Tags: Reasoning, AI safety, Papers

Summary generated by Claude — human-verified