Pepa Atanasova

A study comparing instance attribution and neuron attribution methods in language models.

2025-08-15T11:50:48+00:00 ― 7 min read

A unified framework to assess explanation types for better model understanding.

2025-07-25T18:08:48+00:00 ― 5 min read

A study on how language models handle conflicting information.

2025-07-07T09:35:24+00:00 ― 6 min read

A new framework enhances natural language explanations for AI models, fostering user trust.

2025-03-03T18:35:06+00:00 ― 7 min read