Simple Science

Cutting-edge science explained simply

Computer Science · Computation and Language · Artificial Intelligence

Developing a Thai Financial Language Model

Creating a specialized model for Thai finance through innovative techniques.

KBTG Labs, Atthakorn Petchsod, Pornchanan Balee, Danupat Khamnuansin, Anuruth Lertpiya, Chanatip Saetia, Tawunrat Chalothorn, Thadpong Pongthawornkamol, Monchai Lertsutthiwong

― 9 min read


Figure: Thai Financial Model Development, a focused approach to training an LLM for Thai finance.

Large Language Models (LLMs) are the superheroes of general text tasks: they handle most things well. However, when it comes to niche fields like finance, they trip over specialized jargon and local rules. Even finance-focused models like FinGPT and BloombergGPT aren't cut out for the Thai finance scene, because they don't speak the local money talk.

To fix this, we've whipped up a special Thai Financial LLM using questions from the Investment Consultant (IC) exam run by the Stock Exchange of Thailand. Given that our dataset was smaller than we'd like, we jazzed it up with data augmentation, ReLoRA for efficient training, continued pretraining for domain knowledge, and rank-stabilized LoRA for fine-tuning, all to make sure it understood Thai finance. We put the model through mock exams to see how it performed, and it did pretty well, scoring 72% on each of the first two levels and 84% on the third.

The Rise of Large Language Models

In the past few years, LLMs have gotten pretty good at many tasks, especially conversations. These models learn general stuff from a lot of text. One of the stars of this show is Llama 3.1. It has been acing conversation tasks without needing a cheat sheet.

But here’s the kicker: LLMs struggle with the tricky, specialized terms of certain fields. Finance is a prime example: a model has to grasp the meaning behind complex terms and calculations, all while following local rules. But hey, no worries!

Newer models, like FinGPT and BloombergGPT, are stepping up their game. Still, they don’t quite understand the Thai finance landscape. There's a gap that needs filling.

Filling the Gap

We saw this gap and thought, "Why not build a model that actually gets Thai finance?" So, we took the Investment Consultant exam from the Stock Exchange of Thailand to use as our training ground. But since we were working with a small dataset, we went all out on data augmentation. This magic trick basically multiplies our data to make our model smarter.

We used a method called ReLoRA to make training faster and more efficient. On top of that, we designed two special training regimes to get the model ready for real-life exam situations. The results were impressive: our model passed with flying colors!

How We Did It

Building the Model

We built a language model focused on the Thai financial domain. Since the Investment Consultant exam dataset on its own was small, we expanded it through clever augmentation techniques.

Enhancing Training

We made it easier for the model to learn using ReLoRA. This technique lets us train big models faster while keeping them strong. By using continued pretraining, we ensured the model was well-versed in finance basics before diving deeper into specific topics. And for fine-tuning, we used Rank-Stabilized LoRA (rsLoRA), which rescales the low-rank updates so training stays stable even at higher adapter ranks.
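To make the rsLoRA part concrete, here is a minimal sketch of enabling it with the Hugging Face peft library. The paper doesn't spell out its exact training stack, so the base checkpoint, rank, and target modules below are illustrative assumptions, not the authors' settings.

```python
# Hypothetical rsLoRA setup via Hugging Face peft. The checkpoint name and
# hyperparameters are assumptions for illustration only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,
    use_rslora=True,                       # scale updates by alpha/sqrt(r) instead of alpha/r
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)       # only the adapter weights will train
```

The `use_rslora` flag is what swaps the usual `alpha / r` scaling for `alpha / sqrt(r)`, which keeps update magnitudes steady as the rank grows.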

We also created two ways to train: one that mimicked real exam conditions and another that helped the model learn from its mistakes. With these strategies, our model was fine-tuned to tackle any question thrown at it.

A Quick Overview of Our Work

  1. Thai Financial LLM Development: We built a model just for Thai finance using the Investment Consultant exam.

  2. Data Augmentation: We employed techniques to increase our limited dataset, making our model smarter.

  3. Efficient Training: We used ReLoRA to get the most out of our training time and resources while ensuring the model learned effectively.

  4. Exam Simulation and Feedback: We created a realistic exam environment and used feedback to continuously improve the model.

With these techniques combined, we crafted an LLM that can tackle financial advisory questions like a pro!

Financial Domain LLMs

LLMs are useful for finance tasks since they can handle many different language challenges. Each model has its strengths, like multilingual support or speed, but that alone isn't enough: they still need to adapt to the specific needs of the finance world.

Some models, like FinBERT, focus solely on sentiment analysis within financial texts. FLUE is a benchmark for financial language understanding, with FLANG-BERT as its companion model. BloombergGPT draws on proprietary troves of financial data to ace finance tasks, while FinGPT is all about making finance more accessible through open-source techniques.

However, many existing models fall short on Thai-specific knowledge. They often miss the mark on local rules and conventions, which can lead to some awkward misunderstandings.

What is the Investment Consultant License Exam?

The Investment Consultant License Exam is a required test for professionals who want to give investment advice in Thailand. It has three levels: P1, P2, and P3. Each level builds on the previous one, making sure candidates know what they’re doing.

Plain Product (P1)

This basic level looks at three key areas:

  • Fundamental Knowledge: Things like investment environments and risk.
  • Related Rules and Regulations: Understanding the legal side.
  • Product Knowledge: This covers different financial products like stocks and bonds.

It consists of 100 multiple-choice questions, and you must score at least 70% to pass.

Complex Product 1 (P2)

This level dives deeper, focusing on complex financial products like structured bonds and mutual funds. It has 25 multiple-choice questions and also requires at least 70% to pass.

Complex Product 2 (P3)

This is the big leagues, covering derivatives like futures and options. It consists of 50 multiple-choice questions, and you again need at least 70% to pass.

The Machinery Behind ReLoRA

ReLoRA is a smart way to train big models without burning through resources. It works by using low-rank updates, which sounds fancy but basically means it gets the model to improve without exhausting your computer.

How Does It Work?

  • Initial Training Phase: Start with full-rank training to set a solid base.
  • Low-Rank Updates: Apply lighter updates to keep things moving.
  • Learning Rate Schedule: Reset the learning pace to keep training smooth.
  • Optimizer Resets: Refresh parts of the optimizer to avoid getting stuck.

This clever system not only speeds up the training process but also makes it less resource-intensive, which is music to the ears of anyone trying to save money.
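To make that cycle concrete, here is a toy, self-contained sketch of the ReLoRA rhythm: train low-rank factors, merge them into the frozen base weight, re-initialize the adapters, reset the optimizer, and restart the learning-rate warmup. The toy regression task, sizes, and schedule are all illustrative; the sketch also skips ReLoRA's initial full-rank phase and simplifies its partial optimizer reset into a full one.

```python
# Toy ReLoRA cycle on a single linear layer. Everything here (shapes, data,
# hyperparameters) is illustrative, not the paper's actual setup.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, r = 64, 4                                  # hidden size and adapter rank
W = torch.randn(d, d) * 0.02                  # frozen base weight
A = torch.zeros(r, d, requires_grad=True)     # low-rank update: delta = B @ A
B = (torch.randn(d, r) * 0.01).requires_grad_()

x = torch.randn(256, d)                       # toy inputs
y = x @ (torch.randn(d, d) * 0.1).T           # toy targets

base_lr, warmup = 1e-3, 20
opt = torch.optim.AdamW([A, B], lr=base_lr)

for cycle in range(3):                        # each cycle = one low-rank phase
    for step in range(100):
        # Jagged schedule: the warmup restarts at the start of every cycle.
        for g in opt.param_groups:
            g["lr"] = base_lr * min(1.0, (step + 1) / warmup)
        loss = F.mse_loss(x @ (W + B @ A).T, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        W += B @ A                            # merge the low-rank update into the base
        A.zero_()                             # next cycle's delta restarts at zero
        B.normal_(0, 0.01)
    opt = torch.optim.AdamW([A, B], lr=base_lr)  # fresh optimizer state
    print(f"cycle {cycle}: loss {loss.item():.4f}")
```

Because each merged update is low-rank but the merges accumulate, the model can end up with a higher-rank change overall while each phase stays cheap.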

Preparing the Data

Handling large documents can be tricky, especially when preparing data for training. We used a technique called Dynamic Markdown Chunking. This method cuts large documents into smaller, manageable pieces while keeping everything logical and on topic.

Breaking It Down

  1. Initial Chunking: We chunk the document based on its headers, ensuring that each piece is complete in its context.

  2. Further Splitting: If a chunk gets too big, we slice it down further using logical divisions like paragraphs.

This way, our model can digest the information more easily, keeping everything relevant.
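Here's a minimal sketch of that two-stage idea: split on markdown headers first, then split oversized chunks on paragraph boundaries. The character limit and helper names are assumptions (a real pipeline would more likely budget in tokens), not the authors' exact implementation.

```python
# Hypothetical dynamic markdown chunking: header-level splits first,
# paragraph-level splits for anything still too large.
import re

MAX_CHARS = 2000  # illustrative limit; the real budget is likely token-based

def chunk_markdown(text: str) -> list[str]:
    # Stage 1: cut at headers so each chunk keeps its own topical context.
    sections = re.split(r"\n(?=#{1,6} )", text)
    chunks: list[str] = []
    for section in sections:
        if len(section) <= MAX_CHARS:
            chunks.append(section)
            continue
        # Stage 2: split an oversized section on blank lines (paragraphs),
        # accumulating paragraphs until the limit would be exceeded.
        current = ""
        for para in section.split("\n\n"):
            if current and len(current) + len(para) + 2 > MAX_CHARS:
                chunks.append(current)
                current = para
            else:
                current = f"{current}\n\n{para}" if current else para
        if current:
            chunks.append(current)  # a single oversized paragraph is kept whole here
    return [c.strip() for c in chunks if c.strip()]
```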

Smart Data Augmentation

With our training dataset filled with exam questions and a decent amount of study materials, we needed to make sure our model stayed sharp and ready for anything. So, we employed several data augmentation tricks.

Self-Supervised Data Augmentation

To create reasoning data for exam questions, we made the model produce reasons for each answer choice. This way, it could learn from right answers and even the wrong ones.
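As a rough sketch of what that looks like in practice, the function below builds one explanation prompt per answer choice and records the model's reasoning for each. The `generate` callable and the prompt wording are placeholders, not the authors' actual template.

```python
# Hypothetical self-supervised reasoning augmentation: ask a model to
# justify every choice, right or wrong, and keep the explanations.
from typing import Callable

def reasoning_examples(question: str, choices: dict[str, str],
                       correct: str, generate: Callable[[str], str]) -> list[dict]:
    examples = []
    for label, text in choices.items():
        verdict = "correct" if label == correct else "incorrect"
        prompt = (f"Question: {question}\nChoice {label}: {text}\n"
                  f"Explain step by step why this choice is {verdict}.")
        examples.append({"choice": label, "verdict": verdict,
                         "reasoning": generate(prompt)})
    return examples
```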

Multiple System Prompts Augmentation

We presented the same exam content in different ways. This approach got the model used to a variety of scenarios, preparing it for different types of questions.
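A minimal sketch of the idea: one exam item paired with several system prompts. The prompt texts below are invented stand-ins for whatever variants the authors used.

```python
# Hypothetical system-prompt augmentation: same content, varied framing.
SYSTEM_PROMPTS = [
    "You are a Thai investment consultant. Answer the exam question.",
    "Answer the following multiple-choice question about Thai finance.",
    "As a financial advisor preparing for the IC exam, pick the best answer.",
]

def augment_with_system_prompts(question: str, answer: str) -> list[dict]:
    return [{"system": sp, "user": question, "assistant": answer}
            for sp in SYSTEM_PROMPTS]
```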

Multiple-Choice Shuffling

To keep the model focused on the questions and not the order of answers, we mixed up the answer choices. This way, it had to pay attention to the content rather than patterns.
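A short sketch of that shuffle, including the remapping of the answer key so the label still points at the right content:

```python
# Choice shuffling: permute the options and track where the answer lands.
import random

def shuffle_choices(choices: list[str], answer_idx: int) -> tuple[list[str], int]:
    order = list(range(len(choices)))
    random.shuffle(order)
    shuffled = [choices[i] for i in order]
    return shuffled, order.index(answer_idx)  # new position of the correct choice

# Example: "Bonds" stays the correct answer wherever it moves.
opts, ans = shuffle_choices(["Stocks", "Bonds", "Futures", "Options"], 1)
print(opts, "answer index:", ans)
```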

Multi-LLM Response Generation

We harnessed the power of multiple models to produce various answers for each question, enriching our dataset and improving the model's learning.
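In sketch form, this is just fanning the same question out to several generation backends and collecting every response as candidate training data. The `generators` mapping is a placeholder for real API clients, which the summary does not name.

```python
# Hypothetical multi-LLM response collection.
from typing import Callable

def multi_llm_responses(question: str,
                        generators: dict[str, Callable[[str], str]]) -> list[dict]:
    return [{"model": name, "question": question, "response": gen(question)}
            for name, gen in generators.items()]
```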

Question-Answer Generation from Markdown

Using the structure of markdown documents, we generated question-answer pairs based on the headers and their corresponding content. This gave us a trove of meaningful questions and answers for training.
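Here's a hedged sketch of one way that can work: each header becomes a question stem and its section body becomes the answer. The question template is an assumption; the authors likely used an LLM to phrase richer questions.

```python
# Hypothetical QA-pair generation from markdown headers and their bodies.
import re

def qa_pairs_from_markdown(text: str) -> list[dict]:
    pairs = []
    for section in re.split(r"\n(?=#{1,6} )", text):
        parts = section.split("\n", 1)
        header_line = parts[0].strip()
        if not header_line.startswith("#"):
            continue  # text before the first header has no header to ask about
        header = header_line.lstrip("#").strip()
        body = parts[1].strip() if len(parts) > 1 else ""
        if body:
            pairs.append({
                "question": f"What should an investment consultant know about {header}?",
                "answer": body,
            })
    return pairs

doc = "# Bonds\nA bond is a debt instrument...\n# Equities\nShares represent ownership..."
print(qa_pairs_from_markdown(doc))
```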

Optimizing the Model

Continued Pretraining

We pre-trained the model on a part of our study materials using chunks of markdown data to help it grasp the basics of finance.

Supervised Fine-Tuning

We used two methods:

1. Chain-of-Thought (CoT) on Reasoning: This method boosted the model's reasoning skills by making it explain the correct answers (a sketch of one such training example follows this list).

  2. Question-Answer Fine-Tuning: Here, we trained with several question-answer pairs, improving its adaptability and generalization.
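To picture what a single CoT fine-tuning record might look like, here is a hedged sketch: the target output walks through the reasoning before committing to an answer letter. The chat-message layout mirrors common SFT conventions and is an assumption, not a confirmed detail of the paper.

```python
# Hypothetical layout of one CoT fine-tuning example.
def cot_sft_example(question: str, choices: dict[str, str],
                    reasoning: str, answer: str) -> dict:
    options = "\n".join(f"{k}. {v}" for k, v in choices.items())
    return {
        "messages": [
            {"role": "user", "content": f"{question}\n{options}"},
            {"role": "assistant",
             "content": f"{reasoning}\nTherefore, the answer is {answer}."},
        ]
    }
```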

Direct Preference Optimization

We applied two variations of DPO to sharpen the model’s reasoning skills (a sketch of the underlying loss follows this list):

  1. CoT on Reasoning: This variant helped the model generate the best explanations.

  2. Zero-shot Learning with Shuffling: The focus here was on prioritizing content over position.
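For the curious, the DPO objective itself is compact enough to sketch directly: given sequence log-probabilities under the policy and a frozen reference model, it rewards the policy for widening the gap between the chosen and rejected responses. The beta value and toy numbers below are illustrative.

```python
# The standard DPO loss on precomputed sequence log-probabilities.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp: torch.Tensor, policy_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor, ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Log-ratio of policy vs. reference for each response.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Maximize the margin between chosen and rejected log-ratios.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with made-up log-probabilities:
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss)  # shrinks as the policy favors the chosen response more than the reference does
```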

Experimental Setup

To see how well our model worked, we ran tests on public IC exams. We used various commercially available models and instruction-tuned foundational models to benchmark performance.

Training Dataset

Our dataset contained:

  • Mock Exams: A limited number of simulated tests that covered all three exam levels.

  • Study Materials: Over 1.3 million tokens worth of content covering many important financial topics.

Public Investment Consultant Practice Exam

We chose practice exams provided by the Stock Exchange of Thailand (SET) as our testing data. This let us compare our results against known benchmarks.

Results

After running our tests, the results showed lively competition among the models. Commercial APIs like gpt-4o posted robust scores across all tests. But what was even more exciting was that our homegrown model, THaLLE-IC, held its own, especially on the trickier P3 exam.

Conclusion

In this report, we covered the journey of creating THaLLE-IC, a model specifically designed for the Thai financial domain. Through clever data and training strategies, we managed to equip it with the skills necessary to handle real-world exam questions.

While commercial models tend to shine across the board, THaLLE-IC proves that well-tuned open-source models can compete, offering promising performance at a fraction of the cost. As we move forward, it’s clear that with the right approach, we can make smart models even smarter without breaking the bank.

Acknowledgments

Thanks to everyone who supported us in bringing this project to life, especially our project managers and leading team members.

Original Source

Title: Thai Financial Domain Adaptation of THaLLE -- Technical Report

Abstract: Large Language Models (LLMs) excel in general tasks but struggle with domain-specific challenges, such as specialized terminology and localized regulations. Existing financial LLMs, like FinGPT and BloombergGPT, lack support for the Thai financial domain. We developed a Thai Financial LLM using the Investment Consultant (IC) exam dataset from the Stock Exchange of Thailand. To address dataset limitations, we applied data augmentation, ReLoRA for efficient training, Continued Pretraining (CPT) for domain knowledge, and Rank-Stabilized LoRA (rsLoRA) for fine-tuning. Supervised Fine-Tuning (SFT) simulated exam scenarios, while Direct Preference Optimization (DPO) refined the model using feedback. The model achieved scores of 72%, 72%, and 84% on IC exam levels P1, P2, and P3, respectively, demonstrating its effectiveness in Thai financial advisory tasks and its potential for specialized applications.

Authors: KBTG Labs, Atthakorn Petchsod, Pornchanan Balee, Danupat Khamnuansin, Anuruth Lertpiya, Chanatip Saetia, Tawunrat Chalothorn, Thadpong Pongthawornkamol, Monchai Lertsutthiwong

Last Update: 2024-11-27 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2411.18242

Source PDF: https://arxiv.org/pdf/2411.18242

Licence: https://creativecommons.org/licenses/by-sa/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
