Assessing the accuracy of neuron explanations in language models reveals significant flaws.
― 5 min read
Explore how interpretability illusions affect our view of neural networks.
― 7 min read
A study assessing various methods for interpreting language model neurons.
― 7 min read
An in-depth look at Gated Recurrent Units in sequence learning.
― 6 min read
This article assesses how effectively sparse autoencoders represent knowledge about cities.
― 5 min read