Analyzing Research on Sustainable Development Goals

Table of Contents

Making Sense of Research on SDGs
How the System Works
The Heavy Lifting: Topic Modeling
Why Use Large Language Models?
Visualizing the Results
Understanding the Sustainable Development Goals
Gathering the Data
The Cleaning Process
The TETYS Pipeline
Topic Modeling Explained
Using the Dashboard
The Topic Comparison Feature
Results and Findings
Insights from Each Macro-Area
Quality Check: Evaluating the Results
Manual Evaluation
The Power of Visualization
User-Friendly Exploration
Conclusion

The world has its fair share of issues that make life a bit harder for everyone. In 2015, the United Nations came up with a list called the Sustainable Development Goals (SDGs) to help tackle these problems by 2030. These goals cover a range of important topics like gender equality, education, poverty, health, and climate change. They all aim to create a better and sustainable future for everyone.

Making Sense of Research on SDGs

With so much research being done, it can be overwhelming to sort through it all. That's where natural language processing (NLP) comes in. By using NLP techniques, we can sift through academic papers to figure out what people are saying about the SDGs. We built a system that does just that. It fetches research papers, identifies common topics in them, and provides insights into how attention to SDG-related topics has shifted over time.

How the System Works

Fetching Data: We retrieve information from the Scopus database, which is a treasure trove of academic papers. We focus on five groups of SDGs to make our work more manageable.
Finding Topics: Once we have the data, we use a method called topic modeling. This term simply means we look for patterns in the text to determine which main topics people talk about.
Exploring Topics: We allow users to explore these topics through easy keyword searches. Users can also see how the frequency of these topics changes over time.

The Heavy Lifting: Topic Modeling

We employ a tool called BERTopic, which is like a superhero for topic modeling. This tool helps us look at thousands of papers and identify hundreds of topics. We’ve made some improvements to this tool, including better ways to represent the text so it makes more sense. We even created a system to find the best settings to use with our data, making our work more efficient.
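
The text does not say how the best settings are found, so the following is only a minimal sketch of one common approach, an exhaustive grid search. The parameter names (n_neighbors, min_cluster_size) are settings often tuned in a BERTopic pipeline but are assumptions here, and toy_score is a hypothetical stand-in for a real quality measure such as topic coherence.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every parameter combination and return the best one with its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid, for illustration only (not the authors' actual settings).
param_grid = {"n_neighbors": [5, 15, 30], "min_cluster_size": [10, 25, 50]}


def toy_score(params):
    # Stand-in for a real evaluation: in practice this would fit a topic
    # model with these settings and measure the quality of its topics.
    return -abs(params["n_neighbors"] - 15) - abs(params["min_cluster_size"] - 25)


best_params, best_score = grid_search(param_grid, toy_score)
print(best_params, best_score)
```

In a real setup, each call to the scoring function would be expensive, so the grid is usually kept small or replaced by a smarter search strategy.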

Why Use Large Language Models?

In simple terms, we rely on large language models (LLMs) for our text analysis. These are models trained on very large amounts of text. They help us create better representations of the content in academic abstracts, which makes our insights more meaningful.

Visualizing the Results

Once we've processed everything, we present the findings through interactive dashboards. Users can see how different topics have developed over the years and explore keywords related to those topics. It’s like having a time machine for academic research!

Understanding the Sustainable Development Goals

The SDGs are a list of 17 goals that target pressing global issues. These goals are not just random ideas; they are carefully designed to work together to improve life on Earth. We have grouped these goals into five main areas to make our analysis clearer:

Basic Human Needs and Well-being
Environmental Sustainability
Economic Development and Employment
Equality and Social Inclusion
Global Partnerships and Peace

By categorizing them, we can more easily identify trends and topics in the literature.

Gathering the Data

We access the Scopus database because it’s one of the largest sources of academic papers. When we pull this data, we specifically look for English abstracts that cover our five areas of interest. We clean the data to ensure quality, which means we remove duplicates and only keep relevant information.

The Cleaning Process

When we gather all this information, it’s important to sift through it carefully. We check for missing details like titles and publication dates, and we make sure there’s no duplicate content. This process ensures we have high-quality data to work with.
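
The checks described above can be sketched in a few lines. This is a minimal illustration, assuming each paper is a dictionary with title, abstract, and year fields; those field names and the sample records are made up, and the actual pipeline may apply different or additional rules.

```python
def clean_records(records):
    """Drop records with missing fields, then drop duplicate abstracts."""
    required = ("title", "abstract", "year")
    seen = set()
    cleaned = []
    for rec in records:
        # Skip records where any required field is absent or empty.
        if any(not rec.get(field) for field in required):
            continue
        # Normalize case and whitespace so trivially different copies match.
        key = " ".join(str(rec["abstract"]).lower().split())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Made-up sample records, for illustration only.
records = [
    {"title": "Clean water access", "abstract": "Water quality in rural areas.", "year": 2019},
    {"title": "Clean water access (copy)", "abstract": "water quality  in rural areas.", "year": 2019},
    {"title": "", "abstract": "No title here.", "year": 2020},
    {"title": "Solar adoption", "abstract": "Household solar uptake.", "year": None},
    {"title": "Gender and pay", "abstract": "Wage gaps across sectors.", "year": 2021},
]
cleaned = clean_records(records)
print([rec["title"] for rec in cleaned])
```

Only the first copy of a duplicated abstract is kept, and records without a title or publication year are dropped.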

The TETYS Pipeline

Our system, which we’ve creatively named TETYS (Topics Evolution That You See), is made up of two parts:

Building the Topic Model: This is where we create a solid understanding of the topics based on the research papers.
Exploring and Visualizing: This part lets users interact with the findings. They can see word clouds, search for specific keywords, and even compare the changes in topics over time.

Topic Modeling Explained

At its core, topic modeling is about grouping documents that discuss similar themes. It’s like putting together a collection of books on the same subject. We use a process that involves this sequence:

Convert documents into data: We turn the text of the abstracts into a format we can analyze.
Reduce complexity: We simplify the data so it’s manageable.
Group similar documents: We cluster the data based on similarities.
Tokenization: We break down the text into keywords.
Identify important words: We keep track of which words are most significant in the context of each topic.
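
The last two steps can be illustrated with a small, dependency-free sketch. It assumes the documents have already been clustered (the labels list stands in for the output of the grouping step) and scores words with a simplified form of class-based TF-IDF, the weighting BERTopic uses by default: a word's count within a topic multiplied by log(1 + A / f), where A is the average number of words per topic and f is the word's count across all topics. The function names and sample texts are illustrative, and the real library applies extra normalization.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_words_per_topic(docs, labels, n_words=3):
    """Rank the words of each topic with a simplified class-based TF-IDF."""
    # Merge all documents of a topic into one bag of words.
    counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        counts[label].update(tokenize(doc))

    # f: how often each word occurs across all topics.
    total = Counter()
    for bag in counts.values():
        total.update(bag)
    # A: average number of words per topic.
    avg_words = sum(total.values()) / len(counts)

    ranked = {}
    for label, bag in counts.items():
        scores = {w: tf * math.log(1 + avg_words / total[w]) for w, tf in bag.items()}
        # Sort by score (descending), then alphabetically to break ties.
        ranked[label] = sorted(scores, key=lambda w: (-scores[w], w))[:n_words]
    return ranked


# Made-up abstracts and cluster labels, for illustration only.
docs = [
    "solar energy wind energy",
    "renewable energy storage",
    "clean water sanitation water",
    "water access health",
]
labels = [0, 0, 1, 1]
ranked = top_words_per_topic(docs, labels)
print(ranked)
```

Words that are frequent inside one topic but rare elsewhere receive the highest scores, which is what makes them useful as topic labels.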

Using the Dashboard

With the TETYS dashboard, users can select any of the five macro-areas we mentioned earlier. They can search for keywords or simply view trending topics. The dashboard provides different views, including individual topics and comparisons between them.

The Topic Comparison Feature

Users can pick multiple topics and see how they stack up against each other in terms of how often they appear in the literature over time. This feature allows for a more dynamic investigation into trends.
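
Behind such a comparison is a simple count of papers per topic and year. The sketch below assumes each paper has already been assigned a topic and a publication year; the function name and the sample data are made up for illustration and are not taken from the dashboard itself.

```python
from collections import Counter


def topic_frequency_by_year(assignments, topics):
    """Count papers per (topic, year) and return one series per selected topic."""
    counts = Counter(assignments)
    years = sorted({year for _, year in assignments})
    # Years with no papers for a topic appear as 0 in its series.
    series = {topic: [counts[(topic, year)] for year in years] for topic in topics}
    return years, series


# Made-up (topic, year) pairs, one per paper, for illustration only.
assignments = [
    ("renewable energy", 2018),
    ("renewable energy", 2019),
    ("renewable energy", 2019),
    ("clean transportation", 2018),
    ("clean transportation", 2020),
    ("renewable energy", 2020),
    ("renewable energy", 2020),
    ("renewable energy", 2020),
]
years, series = topic_frequency_by_year(
    assignments, ["renewable energy", "clean transportation"]
)
for topic, values in series.items():
    print(topic, dict(zip(years, values)))
```

Plotting each series against the shared list of years gives one line per selected topic, which is the kind of view a comparison chart needs.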

Results and Findings

As we ran our analyses, we identified a hefty number of topics across the five macro-areas:

Basic Human Needs and Well-being: 550 topics
Environmental Sustainability: 856 topics
Economic Development and Employment: 181 topics
Equality and Social Inclusion: 136 topics
Global Partnerships and Peace: 167 topics

The number of topics correlates with the amount of research available in each area. For instance, the first two areas had a lot of research, resulting in many identified topics.

Insights from Each Macro-Area

With our analysis, we were able to connect certain topics to the specific SDGs. For example:

In Basic Human Needs, topics related to clean water and health emerged prominently.
In Environmental Sustainability, discussions about renewable energy and clean transportation were highlighted.
For Economic Development, research about job growth and financial stability was significant.
Equality and Social Inclusion focused on issues of gender-based violence and reduced inequality.
Finally, Global Partnerships featured diverse topics, showing the multitude of ways partnerships can be approached.

Quality Check: Evaluating the Results

Every model needs to be checked for quality. For our topic modeling results, we compared our current approach to a previous one, looking for improvements in how well each method identified topics.

Manual Evaluation

We performed manual checks on a sample of abstracts to see if our new configurations were actually better. With two trained evaluators assessing the results, we looked at aspects like precision and recall:

Precision: Of the topics the model assigned, what share were correct?
Recall: Of the topics that should have been assigned, what share did the model find?

Our new model scored better in these evaluations, meaning it was generally better at identifying topics accurately.
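
For a single abstract, the two measures can be computed from the set of topics the model assigned and the set the evaluators judged correct. This is a minimal sketch of the standard definitions; the article does not describe the exact evaluation protocol, and the topic names below are made up.

```python
def precision_recall(assigned, relevant):
    """Return (precision, recall) for one abstract's topic assignments."""
    assigned, relevant = set(assigned), set(relevant)
    correct = assigned & relevant
    # Precision: share of assigned topics that are correct.
    precision = len(correct) / len(assigned) if assigned else 0.0
    # Recall: share of correct topics that were assigned.
    recall = len(correct) / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: the model assigned three topics, evaluators expected four.
assigned = ["water", "health", "energy"]
relevant = ["water", "health", "sanitation", "poverty"]
precision, recall = precision_recall(assigned, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three assigned topics are correct, and two of the four expected topics were found. Averaging these values over a sample of abstracts gives a single score per model configuration.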

The Power of Visualization

Visuals can help make sense of complex data. Our dashboard uses different forms of visualization to present findings about the trends in the literature. Word clouds, topic frequency graphs, and more allow users to grasp the information quickly.

User-Friendly Exploration

Our dashboard is designed for easy navigation. Users can explore topics by selecting a macro-area, inputting keywords, or looking at trending topics. The information is presented in a clear and informative way, making it accessible to anyone interested.

Conclusion

In summary, the TETYS system allows for a comprehensive analysis of research literature related to the Sustainable Development Goals. By combining topic modeling, large language models, and interactive dashboards, we can identify and explore how research trends change over time.

This setup not only enhances our understanding of the literature but also makes it easy for users to engage with the data in a meaningful way. Whether researchers, students, or professionals, everyone can benefit from these insights as we collectively work toward a better future.

And remember, if saving the world was easy, we’d all be doing it by now! So let’s keep exploring, analyzing, and learning together!
