Language Models and Brain Activity: A Study
Investigating connections between language models and brain responses during story listening.
Eunji Kim, Sriya Mantena, Weiwei Yang, Chandan Singh, Sungroh Yoon, Jianfeng Gao
Setting Up the Models
We train two variants of the language model: one uses the GPT-2 tokenizer, the other the LLaMA-2 tokenizer. The GPT-2 variant has four transformer layers, while the LLaMA-2 variant has three. Think of them as two cars built for the same road, just with slightly different engines.
Because the model compares words by where they sit relative to one another, it uses relative positional encoding to keep track of each word's position in the sequence. The GPT-2 variant covers up to 32 positions, while the LLaMA-2 variant handles 64, a bit like having a bigger parking lot for more cars. Each model inherits its vocabulary from the tokenizer it is paired with, so everything fits together cleanly.
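To make that setup concrete, here is a minimal sketch of the two configurations as a plain Python dataclass. The field names, and the idea of storing the setup this way at all, are illustrative assumptions rather than the authors' actual configuration code; only the layer counts, position limits, and inherited vocabularies come from the description above.

```python
# Minimal sketch of the two configurations described above.
# The schema is illustrative, not the authors' actual config code.
from dataclasses import dataclass

@dataclass
class SimilarityModelConfig:
    tokenizer: str       # which pretrained tokenizer supplies the vocabulary
    vocab_size: int      # inherited from that tokenizer
    n_layers: int        # number of transformer layers
    max_positions: int   # context length covered by relative positional encoding

# GPT-2-tokenizer variant: 4 layers, 32 relative positions, GPT-2 vocabulary.
gpt2_cfg = SimilarityModelConfig("gpt2", 50257, n_layers=4, max_positions=32)

# LLaMA-2-tokenizer variant: 3 layers, 64 relative positions, LLaMA-2 vocabulary.
llama2_cfg = SimilarityModelConfig("llama-2", 32000, n_layers=3, max_positions=64)
```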
Creating Similarity Pairs with Language Models
To train these models, we use LLaMA-2 as a teacher. We gather text from a range of sources and tokenize it with whichever tokenizer the model uses. During training we randomly sample sequences of 32 or 64 tokens, with batch sizes of 128 or 256, so each training step covers a large number of token contexts.
We then create pairs of tokens that count as similar, based on how often they appear together in the training material. Think of similarity pairs as pairs of friends who tend to hang out together. The models learn to predict the next token from what they have seen so far, trained with a combination of loss functions that pull their predictions closer to the right answers over time. Training runs for a long stretch on high-powered GPUs.
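As a rough illustration of the "friends who hang out together" idea, the sketch below counts how often two tokens land in the same fixed-size window and keeps the frequent pairs. It is a toy co-occurrence counter, not the study's actual similarity computation, which also relies on the LLaMA-2 teacher and a learned similarity metric.

```python
# Toy co-occurrence counter: which token pairs keep showing up in the same window?
from collections import Counter
from itertools import combinations

def cooccurrence_pairs(token_sequences, window=32, min_count=5):
    """Count unordered token pairs that share a window; keep the frequent ones."""
    counts = Counter()
    for seq in token_sequences:
        for start in range(0, len(seq), window):
            chunk = set(seq[start:start + window])          # unique tokens in one window
            counts.update(combinations(sorted(chunk), 2))   # every unordered pair
    return [pair for pair, count in counts.items() if count >= min_count]

# Toy example with integer token ids: prints [(1, 2), (1, 5)].
print(cooccurrence_pairs([[1, 2, 3, 2, 5, 1], [2, 5, 7, 1]], window=4, min_count=2))
```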
Finding the Right Threshold for Estimations
Once the models are trained, we need to set a threshold for effective predictions: a value that determines when the model is doing well enough to rely on. To find the best setting, we tried a range of values on a training set of 100 million tokens, much like testing several recipes to find the tastiest one.
We looked at six datasets to see how the setting affected performance. Each dataset took a turn as the test set while the others were used to build the model, and we compared accuracy across different threshold values. The GPT-2-tokenizer model worked best with the threshold set to 8, while the LLaMA-2-tokenizer model performed better at 9.
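The sweep itself can be pictured as a small grid search, as in the hedged sketch below. Here `evaluate_fn` stands in for whatever accuracy measurement the study actually uses; only the shape of the search (try each candidate, average over held-out datasets, keep the best) is being illustrated.

```python
def pick_threshold(datasets, evaluate_fn, candidates=range(4, 13)):
    """Return the threshold with the best average held-out score.

    `evaluate_fn(threshold, dataset)` should return next-token accuracy on one
    held-out dataset; it is a placeholder for the study's actual evaluation.
    """
    best_t, best_score = None, float("-inf")
    for t in candidates:
        # Leave-one-out style: average the score across the held-out datasets.
        score = sum(evaluate_fn(t, d) for d in datasets) / len(datasets)
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Reported outcome above: roughly 8 for the GPT-2-tokenizer model and 9 for LLaMA-2.
```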
Comparing Next-Token Accuracy
In our evaluations, we used several datasets as references. For some we built our own reference data; for others we relied on publicly available models. We then tested how well each model predicted the next word in a sequence.
Comparing the models, we found that the one that takes longer to generate a response often produces better output. This is like waiting longer for a proper meal at a restaurant instead of grabbing a quick snack: the longer wait can lead to a more satisfying result.
We also looked at cases where the models could match words in the context exactly and cases where they had to fall back on fuzzy matches. Fuzzy matching is like recognizing a friend in a crowd: even without a clear view, their clothing or hairstyle gives you a good idea of who they are.
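The difference between the two modes can be sketched on a toy token list. Exact matching looks for earlier occurrences of the current token and reuses whatever followed them (a simplified one-token version of the induction-head idea, which in the paper matches longer suffixes); fuzzy matching instead scores earlier tokens with a similarity function. The toy similarity here, counting shared leading characters, is only a stand-in for the learned neural metric.

```python
def exact_match_candidates(context):
    """Tokens that followed earlier exact occurrences of the last token."""
    last = context[-1]
    return [context[i + 1] for i in range(len(context) - 1) if context[i] == last]

def fuzzy_match_candidates(context, similarity, top_k=3):
    """Tokens that followed the earlier tokens most similar to the last token."""
    last = context[-1]
    scored = [(similarity(context[i], last), context[i + 1])
              for i in range(len(context) - 1)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [token for _, token in scored[:top_k]]

def toy_similarity(a, b):
    """Toy stand-in for the neural metric: shared leading characters."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

ctx = ["the", "cat", "sat", "on", "the", "mat", "while", "the"]
print(exact_match_candidates(ctx))                  # ['cat', 'mat']
print(fuzzy_match_candidates(ctx, toy_similarity))  # most-similar continuations
```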
Insights from fMRI Data
We also examined brain activity with fMRI, a method that records how the brain responds while people listen to stories. We collected data from three participants as they listened to narrative podcasts; they did not need to respond, they simply listened.
Across several scanning sessions, each subject heard roughly 20 hours of unique stories, and each session provided a large number of data points to analyze. From these recordings we measured how well the brain tracked the stories and fit a model that predicts brain activity from the words being heard.
To analyze the data, we removed noise, made sure the recordings were properly aligned, and carefully excluded segments that could confound our conclusions. The goal was to test whether language understanding can be linked to specific patterns of brain activity.
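A common way to build this kind of encoding model is regularized linear regression from word-derived features to each voxel's response. The sketch below uses ridge regression on random placeholder arrays purely to show the shape of the computation; it should not be read as the study's exact pipeline.

```python
# Hedged sketch of an encoding model: predict each voxel's response from
# features derived from the words being heard (placeholder random data).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_train = rng.standard_normal((500, 64))    # time points x word-derived features
Y_train = rng.standard_normal((500, 1000))  # time points x voxels
X_test = rng.standard_normal((100, 64))
Y_test = rng.standard_normal((100, 1000))

encoder = Ridge(alpha=10.0).fit(X_train, Y_train)
Y_pred = encoder.predict(X_test)

# Score each voxel by the correlation between predicted and measured responses.
corr = [np.corrcoef(Y_pred[:, v], Y_test[:, v])[0, 1] for v in range(Y_test.shape[1])]
print(float(np.mean(corr)))
```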
Fuzzy Matching in Brain Responses
For the brain data, we built a fuzzy matching model that captures how closely words relate to one another even when they are not exact matches. It weighs how likely each next word is according to its similarity to the words seen before.
By resampling the word-level features to match the slower timing of the fMRI signal, we could make more accurate predictions of the brain responses that correspond to the words being heard. This helped show how different but related words can trigger similar brain activity.
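One simple way to picture that timing alignment is to pool the features of all words that fall inside each scan interval (TR), after shifting them by a rough hemodynamic delay. The sketch below does exactly that; the 4-second lag and the plain averaging are assumptions for illustration, since the actual preprocessing likely uses a more careful interpolation.

```python
import numpy as np

def resample_to_trs(word_times, word_features, tr=2.0, n_trs=100, lag=4.0):
    """Average word features within each scan interval, after a rough lag shift.

    word_times: onset of each word in seconds; word_features: (n_words, n_dims).
    """
    word_times = np.asarray(word_times, dtype=float) + lag   # assumed ~4 s hemodynamic delay
    word_features = np.asarray(word_features, dtype=float)
    pooled = np.zeros((n_trs, word_features.shape[1]))
    for t in range(n_trs):
        in_tr = (word_times >= t * tr) & (word_times < (t + 1) * tr)
        if in_tr.any():
            pooled[t] = word_features[in_tr].mean(axis=0)    # average the words in this TR
    return pooled
```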
Comparing Prediction Performance
Next, we tested the fuzzy matching model against the exact matching model on the brain data. The fuzzy induction model did not surpass the exact matching model by much, possibly because fMRI data are noisy and not always easy to interpret.
Think of listening to a song in a crowded room: you catch the tune but not every word. The fuzzy model behaves the same way, picking up the general pattern while missing some fine detail. The results showed that similar words can activate the same brain areas, but the differences between the models were often subtle.
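Given voxelwise correlations from the two encoding models (computed as in the earlier sketch), the comparison described here boils down to the relative change in their means. The helper below is only a restatement of that arithmetic, not code or numbers from the study.

```python
import numpy as np

def relative_improvement(corr_fuzzy, corr_exact):
    """Relative change in mean voxelwise correlation (0.20 would mean 20% higher)."""
    mean_fuzzy, mean_exact = np.mean(corr_fuzzy), np.mean(corr_exact)
    return (mean_fuzzy - mean_exact) / abs(mean_exact)

# Toy numbers only, not results from the paper.
print(relative_improvement([0.12, 0.18, 0.15], [0.10, 0.16, 0.12]))
```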
Real-World Applications
Understanding the connection between language and the brain could help in several fields: improving teaching methods, shedding light on how to support people with language difficulties, or building artificial intelligence that mirrors human understanding more precisely.
In summary, as we develop these models and explore the brain's responses, it becomes clearer how language works at many levels, from the algorithms that drive machine learning to the neural circuits in our brains. It is an exciting field full of possibilities, and while the learning process can be complex, it can also be quite entertaining!
Title: Interpretable Language Modeling via Induction-head Ngram Models
Abstract: Recent large language models (LLMs) have excelled across a wide range of tasks, but their use in high-stakes and compute-limited settings has intensified the demand for interpretability and efficiency. We address this need by proposing Induction-head ngram models (Induction-Gram), a method that builds an efficient, interpretable LM by bolstering modern ngram models with a hand-engineered "induction head". This induction head uses a custom neural similarity metric to efficiently search the model's input context for potential next-word completions. This process enables Induction-Gram to provide ngram-level grounding for each generated token. Moreover, experiments show that this simple method significantly improves next-word prediction over baseline interpretable models (up to 26%p) and can be used to speed up LLM inference for large models through speculative decoding. We further study Induction-Gram in a natural-language neuroscience setting, where the goal is to predict the next fMRI response in a sequence. It again provides a significant improvement over interpretable models (20% relative increase in the correlation of predicted fMRI responses), potentially enabling deeper scientific investigation of language selectivity in the brain. The code is available at https://github.com/ejkim47/induction-gram.
Authors: Eunji Kim, Sriya Mantena, Weiwei Yang, Chandan Singh, Sungroh Yoon, Jianfeng Gao
Last Update: 2024-10-31 00:00:00
Language: English
Source URL: https://arxiv.org/abs/2411.00066
Source PDF: https://arxiv.org/pdf/2411.00066
Licence: https://creativecommons.org/licenses/by-nc-sa/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Thank you to arxiv for use of its open access interoperability.
Reference Links
- https://huggingface.co/datasets/monology/pile-uncopyrighted
- https://github.com/karpathy/minGPT
- https://infini-gram.io/api_doc.html
- https://infini-gram.io/pkg_doc.html
- https://github.com/AlexWan0/infini-gram/tree/main
- https://github.com/ejkim47/induction-gram
- https://babylm.github.io/
- https://huggingface.co/TinyLLaMA/TinyLLaMA-1.1B-intermediate-step-1431k-3T
- https://github.com/OpenNeuroDatasets/ds003020