Language models can explain neurons in language models
Signal: 75
Hype: 25
In three lines
OpenAI uses GPT-4 to automatically generate explanations of neuron behavior in large language models and to score those explanations. A dataset of these explanations and scores for every neuron in GPT-2 is released.
Summary generated by Claude — human-verified