LEGO: An LLM Skill-Based Front-End Design Generation Platform
In three lines: LEGO is a modular platform for LLM-based digital front-end design generation. It decomposes the flow into 6 steps and extracts 42 reusable circuit skills. On 41 hard VerilogEval v2 problems where GPT-5.2-codex fails, LEGO achieves 80.5% Pass@1 vs a 0% baseline, outperforming hierarchy-verilog (+14.6 points) and VerilogCoder (+2.5 points).
## LEGO: Modularity as the answer to monolithic EDA agent failure
### 1. The prior state
The LLM-based RTL generation landscape was fragmented: each system — VerilogCoder, hierarchy-verilog, MAGE — solved a specific sub-problem without capitalizing on others' solutions. The result was duplicated engineering effort, no reuse of working debugging strategies, and benchmarks where each tool plateaued on different problems. GPT-5.2-codex itself, pushed to "extra-high reasoning effort", fails on 41 problems from the hard subset of VerilogEval v2 — Pass@1 = 0.000. That is the floor from which LEGO is evaluated.
### 2. What LEGO actually does
LEGO decomposes the digital front-end flow into **6 independent steps** formalized as a finite state machine, and represents every agent capability as a standardized, composable, plug-and-play **circuit skill**. Three core components:
- **Circuit Skill Builder**: automates skill extraction with linear scalability. The authors surveyed >100 papers, selected 11 representative open-source projects, and extracted **42 executable circuit skills**.
- **Agent Skill RAG**: sub-millisecond retrieval with no embedding model, a notable architectural choice that eliminates a heavy dependency and reduces retrieval latency.
- **Cross-project skill composition**: skills from different projects can be combined, which is the real generalization test.
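To make the "finite state machine plus plug-and-play skill" idea concrete, here is a minimal sketch. It is not LEGO's actual interface: the summary does not name the six steps, so `STEP_1` to `STEP_6` are placeholders, and `CircuitSkill`, `run_flow`, the skill names, and the project names are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Step(Enum):
    # Placeholder names: the six real steps are not listed in this summary.
    STEP_1 = 1
    STEP_2 = 2
    STEP_3 = 3
    STEP_4 = 4
    STEP_5 = 5
    STEP_6 = 6
    DONE = 7


@dataclass(frozen=True)
class CircuitSkill:
    """Hypothetical uniform wrapper: every skill exposes the same call shape."""
    name: str
    source_project: str
    step: Step
    run: Callable[[dict], dict]


def run_flow(skills: Dict[Step, CircuitSkill], state: dict) -> dict:
    """Walk the steps in order, applying the skill plugged into each one."""
    state = dict(state)  # work on a copy so the caller's dict is untouched
    step = Step.STEP_1
    while step is not Step.DONE:
        skill = skills.get(step)
        if skill is not None:
            state = skill.run(state)
        state.setdefault("trace", []).append(step.name)
        step = Step(step.value + 1)  # linear transition to the next state
    return state


def tag(label: str) -> Callable[[dict], dict]:
    """Build a toy skill body that only records that it ran."""
    def _run(state: dict) -> dict:
        return {**state, "applied": state.get("applied", []) + [label]}
    return _run


# Cross-project composition: two skills from different (made-up) projects.
skills = {
    Step.STEP_2: CircuitSkill("skill_a", "project_x", Step.STEP_2, tag("skill_a")),
    Step.STEP_5: CircuitSkill("skill_b", "project_y", Step.STEP_5, tag("skill_b")),
}
result = run_flow(skills, {"spec": "placeholder spec"})
print(result["applied"], result["trace"])
```

Because every skill shares one call signature, swapping a skill from another project is just a change to the `skills` mapping. The real platform presumably allows non-linear transitions (for example, looping back after a failed check), which this sketch omits.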
Everything is open-source on GitHub: the platform and all 42 skills are available immediately.
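The summary does not say how Agent Skill RAG retrieves skills without an embedding model. One plausible way to get fast lookups is a plain keyword inverted index, sketched below as an assumption rather than a description of LEGO. `KeywordSkillIndex`, the skill names, and the descriptions are all invented for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class KeywordSkillIndex:
    """Hypothetical embedding-free index: token -> names of skills using it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, skill_name: str, description: str) -> None:
        for token in tokenize(description):
            self._postings[token].add(skill_name)

    def search(self, query: str, top_k: int = 3) -> List[Tuple[str, int]]:
        """Rank skills by the number of query tokens they share."""
        scores: Dict[str, int] = defaultdict(int)
        for token in tokenize(query):
            for skill_name in self._postings.get(token, ()):
                scores[skill_name] += 1
        # Highest overlap first; ties broken alphabetically for determinism.
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return ranked[:top_k]


index = KeywordSkillIndex()
index.add("fsm_debug", "debug finite state machine transitions with waveform trace")
index.add("testbench_gen", "generate testbench stimulus for combinational logic")
index.add("kmap_simplify", "simplify combinational logic from karnaugh map")

fsm_hits = index.search("state machine transitions fail in simulation")
logic_hits = index.search("combinational logic")
print(fsm_hits)
print(logic_hits)
```

The second query ties two skills at the same score, which illustrates why a purely lexical scheme can struggle to separate semantically close skills. Whether LEGO's retrieval has that weakness is not something the summary lets us judge.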
### 3. The numbers that matter
On the 41 problems where GPT-5.2-codex fails at 0%:
- **LEGO (individual skills)**: Pass@1 = **0.805** (+80.5 absolute points)
- **LEGO (cross-project compositions)**: Pass@1 = **0.805** (identical, validating composition robustness)
- **hierarchy-verilog**: beaten by **+14.6 points**
- **VerilogCoder**: beaten by **+2.5 points**
- **MAGE**: tied; LEGO matches MAGE without its monolithic architecture
The fact that cross-project compositions reach exactly the same score as individual skills (0.805 vs 0.805) is reassuring — composition doesn't degrade performance — but mildly disappointing: one might expect additive synergy. This suggests the ceiling may lie in the benchmark itself or in skill formulation, not in the architecture.
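With only 41 problems, each one is worth about 2.4 percentage points, so it helps to translate the reported rates into whole problems. The sketch below assumes Pass@1 here is simply the fraction of problems solved with one attempt each. The resulting counts are inferred from the rounded figures above, not reported by the authors, and the helper names are illustrative.

```python
TOTAL_PROBLEMS = 41  # size of the hard subset, from the text


def implied_solved(pass_at_1: float, total: int = TOTAL_PROBLEMS) -> int:
    """Nearest whole number of solved problems for a reported rate."""
    return round(pass_at_1 * total)


def points_to_problems(points: float, total: int = TOTAL_PROBLEMS) -> int:
    """Nearest whole number of problems behind a gap in percentage points."""
    return round(points / 100 * total)


points_per_problem = 100 / TOTAL_PROBLEMS       # about 2.44 points each
lego_solved = implied_solved(0.805)             # 33 of 41 (inferred)
hierarchy_gap = points_to_problems(14.6)        # about 6 problems (inferred)
verilogcoder_gap = points_to_problems(2.5)      # about 1 problem (inferred)

print(f"{points_per_problem:.2f} points per problem")
print(lego_solved, hierarchy_gap, verilogcoder_gap)
```

Read this way, the identical 0.805 for individual skills and cross-project compositions means the same number of problems solved, which need not be the same set of problems. A lead of roughly one problem over VerilogCoder is also a thin margin at this sample size.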
### 4. Winners, losers, open questions
**Potential losers**: Teams maintaining monolithic EDA agents (VerilogCoder, hierarchy-verilog) see their competitive advantage reduced. If LEGO becomes a reference platform, future contributions will flow toward enriching the skill library rather than building closed systems. Proprietary EDA solution vendors banking on LLM pipeline opacity are also exposed.
**Winners**: Hardware teams without resources to maintain a full EDA agent can now assemble pipelines from 42 validated skills. The Circuit Skill Builder's linear scalability means the library can grow without exponential architectural cost.
**What remains open**: VerilogEval v2 covers digital front-end, but the full EDA flow includes synthesis, placement, routing — stages where LEGO makes no claims. Generalization to industrial (non-benchmark) designs is untested. The embedding-free RAG is fast, but its precision on semantically close skills is not discussed. Finally, 42 skills extracted from 11 open-source projects is a limited corpus: coverage of real industrial RTL patterns is unknown.
LEGO is a serious engineering response to a real fragmentation problem. The 80.5% Pass@1 on a subset that GPT-5.2-codex cannot solve is the hardest result to dismiss in recent EDA-LLM literature.
Summary generated by Claude — human-verified