An overview of mechanistic interpretability in transformer-based language models.
― 7 min read
Cutting-edge science explained simply
Studying how language models respond to fictional questions reveals shared characteristics.
― 5 min read