Cutting-edge science explained simply
This study investigates how circuit analysis techniques apply to a large language model.
― 5 min read
Examining how an AI model represents and plays the game Othello.
― 6 min read
Activation patching traces which internal components causally drive language models' outputs and behaviors.
― 5 min read
The study investigates universal neurons in GPT-2 models and their roles.
― 4 min read
Researchers investigate how models adapt when components are removed.
― 6 min read
A closer look at causal attribution methods for large language models.
― 6 min read
Sparse autoencoders enhance the interpretability of AI systems and their decision-making processes.
― 18 min read
Learn how transcoders help clarify complex language models.
― 5 min read
This article examines how certain neurons affect uncertainty in language model predictions.
― 6 min read
This study uses sparse autoencoders to interpret attention layer outputs in transformers.
― 6 min read
JumpReLU SAEs improve reconstruction fidelity while keeping learned representations sparse and interpretable.
― 7 min read
Gemma Scope offers tools for better understanding language models and improving AI safety.
― 6 min read
New metrics improve understanding of sparse autoencoders in neural networks.
― 7 min read
BatchTopK sparse autoencoders select the strongest activations across a batch to improve reconstruction and interpretability.
― 5 min read