What does "ReAGent" mean?
Table of Contents
ReAGent is a method designed to help understand which words or tokens are important when a language model creates text. Traditional methods often look at how different parts of the input affect the final output, but these can be complex and vary between different models.
How It Works
ReAGent changes one word at a time in a sentence and checks how this affects the model's ability to predict the next word. If changing a word leads to a big change in the prediction, that means the word was important. If there is little change, the word was likely not that important.
Why It's Useful
This method can be used with any language model without needing to change the model or access its internal details. This makes it easier to apply in various situations. In tests, ReAGent shows better accuracy in identifying important words compared to other methods.
Overall, ReAGent helps make sense of how language models work by highlighting which words truly matter in the text they generate.