New Method for Better Health Predictions

MultiPopPred improves disease risk assessments for underrepresented populations.

Ritwiz Kamal, Manikandan Narayanan

In the world of genetics, our differences can sometimes lead to trouble. Some groups of people might be more at risk for certain diseases due to their genes and environment. This is particularly true for complex diseases, such as type 2 diabetes and heart problems, which are caused by many small genetic factors working together. For a long time, research has focused mainly on individuals of European descent, leaving out groups like South Asians who may have different health risks.

The Problem

Traditionally, scientists use large studies called genome-wide association studies (GWAS) to find connections between genes and diseases. These studies often focus on large groups of people from a single ancestry. While this yields plenty of helpful insights, it also means that other groups are underrepresented. For instance, many GWAS of South Asian populations include only a few hundred to a few thousand individuals, which isn't enough to make reliable predictions about disease risk.

When scientists try to apply findings from European studies to South Asians, the results do not always carry over. This can lead to misleading estimates of health risks and may even worsen existing health disparities. So, researchers are now looking for new ways to better include underrepresented populations in this important area of study.

The Quest for a Solution

One solution to the lack of data is to simply gather more information from South Asian individuals. However, that can be time-consuming and expensive. Instead, some researchers are trying to use information from other populations where more data is available. They want to see how genetic risks are shared across populations and how that shared information can help predict disease in less-studied populations like South Asians.

This is where MultiPopPred comes into play. It's a clever method designed to use data from multiple populations simultaneously, rather than relying on one group. By doing this, it hopes to provide better predictions for those who are often left out.

What is MultiPopPred?

MultiPopPred is like a new recipe in the kitchen of genetic research. Imagine a chef who needs to make a delicious dish but only has a few ingredients. Instead of using just those few, they call their friends over and borrow some of their spices, vegetables, and sauces. This way, they can create something tasty and appealing.

MultiPopPred works similarly by integrating information from multiple well-studied populations to help predict disease risks for a target population, such as South Asians. It has three versions, tailored to different data situations, and uses a smart method to improve predictions.

How Does It Work?

MultiPopPred employs a method called penalized regression. Think of this as a refined way to weigh and mix the data coming from different populations. Using this method, it pulls together information from various groups to produce more statistically reliable disease risk predictions for the target population.
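To make the idea of penalized regression a little more concrete, here is a minimal Python sketch. It is not the authors' model: the paper's abstract describes a Nesterov-smoothed penalized shrinkage model fit with an L-BFGS routine, whereas this toy swaps in a plain squared penalty and ordinary gradient descent. The function name `fit_shrunk_effects`, the tiny genotype matrix, and every number below are made up for illustration.

```python
def fit_shrunk_effects(X, y, beta_aux, lam, lr=0.05, steps=2000):
    """Toy penalized regression (an illustrative assumption, not the paper's model).

    Minimises  mean((y - X @ beta)^2) + lam * sum((beta - beta_aux)^2)
    by gradient descent, so the target estimates are pulled toward
    effect sizes borrowed from a better-studied population.
    """
    n, p = len(X), len(beta_aux)
    beta = list(beta_aux)  # start from the auxiliary estimates
    for _ in range(steps):
        # Residuals of the current fit on the target data.
        resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)]
        # Gradient of the data-fit term plus gradient of the penalty term.
        grad = [
            2.0 / n * sum(resid[i] * X[i][j] for i in range(n))
            + 2.0 * lam * (beta[j] - beta_aux[j])
            for j in range(p)
        ]
        beta = [beta[j] - lr * grad[j] for j in range(p)]
    return beta


# Hypothetical toy data: 4 individuals, 2 genetic variants (allele counts 0/1/2).
X = [[0, 1], [1, 0], [2, 1], [1, 2]]
true_beta = [0.5, -0.2]   # made-up "true" effects in the target population
y = [sum(x_j * b_j for x_j, b_j in zip(row, true_beta)) for row in X]
beta_aux = [0.3, 0.0]     # made-up effects learned in another population

no_penalty = fit_shrunk_effects(X, y, beta_aux, lam=0.0)
with_penalty = fit_shrunk_effects(X, y, beta_aux, lam=5.0)
print(no_penalty, with_penalty)
```

With `lam=0.0` the fit ignores the borrowed estimates and follows the target data alone; a larger `lam` keeps the result closer to `beta_aux`, which is the kind of trade-off that matters when the target sample is small. The step size `lr` has to be small relative to `lam` for this simple descent to stay stable.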

The three versions of MultiPopPred are as follows:

  1. MultiPopPred-Vanilla: This version uses data from the target population along with summary data from other populations. It gives equal importance to each population's data, mixing everything together to come up with a solid estimate.

  2. MultiPopPred-Admixture: This version takes things a step further by looking at how much of the target population's makeup comes from each of the auxiliary populations. It weighs the data accordingly to create a more accurate prediction.

  3. MultiPopPred-ExtLD: This version is designed for scenarios where individual-level data isn't available. Instead, it uses summary statistics and an external reference to make estimates.

No matter which version is being used, MultiPopPred aims to produce better predictions by effectively utilizing data from multiple sources.
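The difference between equal weighting (Vanilla) and ancestry-based weighting (Admixture) can be sketched in a few lines of Python. This only illustrates the weighting idea; how the weights actually enter MultiPopPred's model is described in the paper, not here. The helper `combine_aux_effects`, the population labels, the effect sizes, and the admixture proportions are all hypothetical.

```python
def combine_aux_effects(aux_effects, weights=None):
    """Weighted average of per-population effect-size vectors.

    aux_effects: dict mapping a population label to a list of effect sizes.
    weights:     dict mapping the same labels to non-negative weights;
                 if omitted, every population counts equally.
    """
    pops = list(aux_effects)
    if weights is None:
        weights = {pop: 1.0 for pop in pops}  # equal importance
    total = sum(weights[pop] for pop in pops)
    n_variants = len(aux_effects[pops[0]])
    return [
        sum(weights[pop] * aux_effects[pop][j] for pop in pops) / total
        for j in range(n_variants)
    ]


# Made-up effect sizes for two variants in two auxiliary populations.
aux_effects = {"POP_A": [0.4, 0.0], "POP_B": [0.2, 0.2]}

# Vanilla-style: each auxiliary population gets the same weight.
equal_mix = combine_aux_effects(aux_effects)

# Admixture-style: weight by an assumed share of the target population's makeup.
admixture_mix = combine_aux_effects(aux_effects, {"POP_A": 0.75, "POP_B": 0.25})

print(equal_mix, admixture_mix)
```

In this toy case the equal mix lands halfway between the two populations, while the admixture-style mix sits closer to `POP_A`, the population assumed to contribute more to the target group's ancestry.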

Testing the Method

To see how well MultiPopPred works, researchers ran a series of tests on simulated genotype-phenotype data. They compared it to other existing methods, focusing on different settings where sample sizes varied dramatically.

So, how did MultiPopPred perform? Let's just say it did pretty well, especially when faced with a situation where there were very few samples from the target population. It often outshined other methods, showing a remarkable improvement in accuracy.

For example, in simulation settings with few target samples, MultiPopPred improved predictions for the South Asian population by 65% compared to state-of-the-art methods. Across all simulation settings, the overall improvement was 21%. This performance makes MultiPopPred a promising tool to help bridge the gap in disease risk assessment for underrepresented populations, although it still needs to be assessed on real-world data.
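For readers wondering what a percentage improvement means in practice, here is a small Python sketch of the arithmetic. It assumes the figure is a relative gain in some prediction-accuracy score; the article does not say which score was used, and the two scores below are invented purely to show the calculation.

```python
def relative_improvement(baseline_score, new_score):
    """Relative gain of new_score over baseline_score, as a fraction."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (new_score - baseline_score) / baseline_score


# Invented accuracy scores, NOT results from the paper.
baseline = 0.20   # hypothetical score of an existing method
improved = 0.33   # hypothetical score of a better method
gain = relative_improvement(baseline, improved)
print(f"{gain:.0%}")  # prints 65%
```

Note that a large relative gain can still correspond to a modest absolute change when the baseline score is low, which is typical of settings with very few samples.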

Why Is This Important?

Understanding how genetics affects health is crucial for improving healthcare and disease prevention. As the research community works to become more diverse, methods like MultiPopPred can help ensure that everyone is included in the conversation. Not only does it help provide better insights for underrepresented groups, but it also reduces the risk of miscalculations and health disparities that could arise from relying solely on data from one group.

Conclusion

MultiPopPred represents an exciting step forward in the field of genetic research. By borrowing knowledge from well-studied populations, it stands to enhance disease risk predictions, particularly for underrepresented groups.

With more accurate predictions, healthcare providers can make better decisions, tailor interventions, and ultimately improve health outcomes for everyone. In a world where differences can sometimes be a source of division, MultiPopPred shows us that sharing knowledge and resources can lead to better health for all.

Who knew that combining data could be so deliciously effective? It’s a scientific feast that aims to serve everyone at the table!

Original Source

Title: MultiPopPred: A Trans-Ethnic Disease Risk Prediction Method, and its Application to the South Asian Population

Abstract: Genome-wide association studies (GWAS) aimed at estimating the disease risk of genetic factors have long been focusing on homogeneous Caucasian populations, at the expense of other understudied non-Caucasian populations. Therefore, active efforts are underway to understand the differences and commonalities in exhibited disease risk across different populations or ethnicities. There is, consequently, a pressing need for computational methods that efficiently exploit these population specific vs. shared aspects of the genotype-phenotype relation. We propose MultiPopPred, a novel trans-ethnic polygenic risk score (PRS) estimation method, that taps into the shared genetic risk across populations and transfers information learned from multiple well-studied auxiliary populations to a less-studied target population. MultiPopPred employs a specially designed Nesterov-smoothed penalized shrinkage model and a L-BFGS optimization routine. We present three variants of MultiPopPred based on the availability of individual-level vs. summary-level data and the weightage of each auxiliary population. Extensive comparative analyses performed on simulated genotype-phenotype data reveal that MultiPopPred improves PRS prediction in the South Asian population by 65% on settings with low target sample sizes and by 21% overall across all simulation settings, when compared to state-of-the-art trans-ethnic PRS estimation methods. This performance trend is promising and encourages application and further assessment of MultiPopPred under other simulation and real-world settings.

Authors: Ritwiz Kamal, Manikandan Narayanan

Last Update: 2024-12-01 00:00:00

Language: English

Source URL: https://www.biorxiv.org/content/10.1101/2024.11.26.625410

Source PDF: https://www.biorxiv.org/content/10.1101/2024.11.26.625410.full.pdf

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to biorxiv for use of its open access interoperability.
