Cutting-edge science, explained simply
This study uses sparse autoencoders to interpret attention layer outputs in transformers.
― 6 min read
JumpReLU SAEs improve reconstruction fidelity while keeping representations simple and clear.
― 7 min read
Gemma Scope offers tools for better understanding language models and improving AI safety.
― 6 min read
A method to improve steering vector effectiveness in language models.
― 5 min read