Using AI and Knowledge Graphs to Repurpose Drugs for Rare Diseases
Exploring AI's role in finding new uses for existing drugs in rare diseases.
Table of Contents
- Related Work
- Knowledge Graphs in Drug Repurposing
- Explainable AI (XAI)
- Method Overview
- rd-explainer Pipeline
- Data Sources
- Machine Learning and Explanations
- Node Features and Data Splitting
- GNN Model
- Link Prediction
- Graph-Based Explanations
- Evaluation and Performance Metrics
- Model Evaluation
- Explanation Evaluation
- Rare Disease Knowledge Graphs
- Graph Construction
- Graph Topology
- Results and Findings
- Performance Metrics
- Predictions Validation
- Explanation Analysis
- Generalizability
- Discussion
- Limitations
- Future Directions
- Conclusion
- Original Source
Developing new medicines is difficult and often fails to produce a successful product. Studies show that about 90% of new drugs fail during testing and never get approved. This wastes enormous amounts of time and money and leaves pharmaceutical companies with no return on their investment. The problem is even greater for rare diseases, because companies may hesitate to invest in drugs for conditions that affect only a few people. Yet around 7,000 rare diseases exist, and effective treatments are available for only about 5% of them. In Europe alone, around 36 million people are affected by rare diseases.
To address these challenges, drug repurposing strategies have emerged. Drug repurposing means finding new uses for drugs that have already been approved for other conditions. By doing so, companies can skip many of the expensive and lengthy steps needed for clinical trials. Recently, computational methods and artificial intelligence (AI) have shown promise in drug repurposing. One interesting approach uses networks that represent how molecules, genes, proteins, and diseases are connected. Such networks can reveal relationships that might not be obvious otherwise.
Despite the potential of AI, many people are still uncertain about trusting its decisions. This is especially true in healthcare, where choices can greatly affect lives. Providing clear reasons for AI decisions can help researchers form testable hypotheses and advance knowledge in the lab. The EU is also pushing for regulations that guarantee people the right to get explanations for decisions made by automated systems.
In this article, we investigate how AI can offer predictions and explanations in the context of drug repurposing for rare diseases. Our main aim is to create a system that identifies existing drugs that may help treat the symptoms of rare diseases. We focus on advanced AI methods and ensure they also provide understandable reasons for their predictions.
Related Work
Knowledge Graphs in Drug Repurposing
Current drug repurposing approaches often rely on graph-based methods and AI to find potential drug candidates. One key benefit of using graphs is their ability to combine data from multiple sources. This is crucial for rare diseases, where data can be limited and scattered. Recent research has shown how knowledge graphs can be used to find drug candidates for conditions like COVID-19.
Different AI techniques can analyze these knowledge graphs, including methods that learn representations of the entities and relationships in the graph. In our work, we combined machine learning techniques so that the graph can be updated without retraining the model. This flexibility is important in drug repurposing, as new information is constantly emerging about drugs, genes, and diseases.
Explainable AI (XAI)
One method used to provide explanations in graph-based models is (Graph)LIME, which examines how small changes in input affect predictions. Another explainability method is CRIAGE, which offers explanations in the form of rules. For our work, however, we chose GNNExplainer, which identifies the key features and connections responsible for specific predictions. This method is adaptable, so it can work with different types of graph models.
Method Overview
rd-explainer Pipeline
We developed a method called rd-explainer specifically for drug repurposing in rare diseases. The process consists of three main parts:
- Knowledge Graph Construction: This part creates a knowledge graph related to a specific rare disease and the goal of finding drug repurposing options.
- Prediction Module: In this stage, we use a neural network model to predict which drugs might help treat symptoms of the rare disease.
- Explainer Module: This last part identifies and shows the important connections that explain why a certain drug was predicted for a particular symptom.
To create the knowledge graph, we gather data from various reliable sources focused on disease information and existing drugs. We then organize this data into a graph structure. Each part of the graph represents different aspects of the disease and potential treatments.
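As a rough illustration of the graph structure described above, the sketch below stores a knowledge graph as (subject, relation, object) triples and indexes them by node. The entity identifiers and relation names are made-up placeholders, not entries from the actual rd-explainer graph or from Monarch.

```python
from collections import defaultdict

# Hypothetical triples for illustration only; not taken from the real graph.
TRIPLES = [
    ("disease:D1", "has_phenotype", "symptom:S1"),
    ("disease:D1", "caused_by", "gene:G1"),
    ("gene:G1", "interacts_with", "gene:G2"),
    ("drug:X1", "targets", "gene:G2"),
]


def build_adjacency(triples):
    """Index triples as an undirected map: node -> {(relation, neighbor)}."""
    adjacency = defaultdict(set)
    for subject, relation, obj in triples:
        adjacency[subject].add((relation, obj))
        adjacency[obj].add((relation, subject))
    return adjacency


def node_types(triples):
    """Count nodes per type, reading the type from the 'type:id' prefix."""
    nodes = {node for s, _, o in triples for node in (s, o)}
    counts = defaultdict(int)
    for node in nodes:
        counts[node.split(":", 1)[0]] += 1
    return dict(counts)


adjacency = build_adjacency(TRIPLES)
type_counts = node_types(TRIPLES)
```

Keeping the node type in the identifier prefix is just a convenience for this toy example; a real graph would more likely carry types as separate node attributes.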
Data Sources
We drew information from several databases that provide details about diseases, genes, and drugs. One key source is the Monarch knowledge base, which collects data on various biological elements. We also used databases specifically focused on drug properties and targets. By integrating these different data sources, we aimed to create a comprehensive knowledge graph that could support our predictions effectively.
Machine Learning and Explanations
Node Features and Data Splitting
Our method builds on a neural network approach in which we create a representation for each node in the graph. Assigning informative features to each node is vital for accurately predicting drug-symptom links. We split the data into training, validation, and test sets so that the model can be tuned and then evaluated on links it has not seen during training.
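A minimal sketch of such a split over drug-symptom edges is shown below. The 80/10/10 proportions, the fixed seed, and the edge names are assumptions chosen for illustration; the source does not state the ratios actually used.

```python
import random


def split_edges(edges, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle edges and split them into train/validation/test lists."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Hypothetical drug-symptom edges for illustration.
edges = [(f"drug:{i}", f"symptom:{i % 5}") for i in range(20)]
train, val, test = split_edges(edges)
```

Because the three lists are slices of one shuffled list, no edge can appear in more than one set.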
GNN Model
We used GraphSAGE, a type of graph neural network, to create embeddings for the nodes in our knowledge graphs. With these embeddings, we can predict connections between drugs and symptoms. GraphSAGE builds each node's embedding by sampling and aggregating features from its neighbors, which lets it handle larger graphs efficiently.
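To make the aggregation idea concrete, here is a small pure-Python sketch of one mean-aggregator layer in one common formulation, relu(W_self h_v + W_neigh mean(h_u)). The toy graph, the two-dimensional features, and the identity weight matrices are assumptions for illustration; a trained model would learn its weights and would typically use a library implementation rather than code like this.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_vectors(vectors, dim):
    """Element-wise mean of a list of vectors (zeros if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sage_layer(features, neighbors, w_self, w_neigh):
    """One mean-aggregator layer: relu(W_self h_v + W_neigh mean(h_u))."""
    dim = len(next(iter(features.values())))
    updated = {}
    for node, h in features.items():
        agg = mean_vectors([features[u] for u in neighbors.get(node, [])], dim)
        combined = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, agg))]
        updated[node] = [max(0.0, x) for x in combined]  # ReLU
    return updated


# Toy graph: drug - gene - symptom, with made-up 2-d features.
features = {"drug": [1.0, 0.0], "gene": [0.0, 1.0], "symptom": [1.0, 1.0]}
neighbors = {"drug": ["gene"], "gene": ["drug", "symptom"], "symptom": ["gene"]}
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder weights, not learned
embeddings = sage_layer(features, neighbors, identity, identity)
```

This sketch aggregates over all neighbors; sampling a fixed-size subset of neighbors per node is the part of GraphSAGE that keeps the cost bounded on large graphs.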
Link Prediction
Our model generates scores to estimate the likelihood that a drug can effectively address certain symptoms. This scoring helps us rank potential drug candidates based on their predicted efficacy.
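The scoring and ranking step might look like the sketch below. It assumes a dot-product decoder followed by a sigmoid, which is a common choice for link prediction but is not confirmed by the source; the embeddings and drug names are invented.

```python
import math


def link_score(drug_embedding, symptom_embedding):
    """Sigmoid of the dot product, read as a link likelihood in (0, 1)."""
    dot = sum(d * s for d, s in zip(drug_embedding, symptom_embedding))
    return 1.0 / (1.0 + math.exp(-dot))


def rank_drugs(drug_embeddings, symptom_embedding):
    """Return (drug, score) pairs sorted from most to least likely."""
    scored = [
        (name, link_score(embedding, symptom_embedding))
        for name, embedding in drug_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up embeddings for three candidate drugs and one symptom.
drug_embeddings = {
    "drug_a": [0.9, 0.1],
    "drug_b": [-0.5, 0.2],
    "drug_c": [0.2, 0.2],
}
symptom_embedding = [1.0, 0.5]
ranking = rank_drugs(drug_embeddings, symptom_embedding)
```

The top of the ranking is what a researcher would inspect first, together with the explanation for each candidate.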
Graph-Based Explanations
To understand the predictions made by our model, we applied GNNExplainer to outline the relationships between drugs and their corresponding symptoms. We refined the explanations to ensure they highlighted meaningful connections, providing clarity for researchers when they look to validate these predictions.
Evaluation and Performance Metrics
Model Evaluation
To test the performance of our model, we compared its results with several standard benchmarks in the field. We used common metrics such as precision, recall, and F1 score to measure how well our model predicts links between drugs and symptoms.
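For reference, these three metrics can be computed from binary labels as in the sketch below, where 1 means a drug-symptom link exists and 0 means it does not. The label vectors are invented for illustration and are not results from the study.

```python
def precision_recall_f1(true_labels, predicted_labels):
    """Compute precision, recall, and F1 for binary labels (1 = link)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Invented labels: 3 true positives, 1 false positive, 1 false negative.
true_labels = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_labels = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(true_labels, predicted_labels)
```

Precision answers "how many predicted links are real", recall answers "how many real links were found", and F1 is their harmonic mean.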
Explanation Evaluation
We also assessed the explanations produced by our model by identifying whether they were complete or incomplete. A complete explanation provides a clear connection between a drug and a symptom, while an incomplete one shows disjointed information. We conducted literature searches to determine whether the connections suggested by our model were supported by existing scientific research.
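One plausible way to automate the complete/incomplete distinction is to check whether the explanation subgraph contains a path from the drug to the symptom. The sketch below does this with a breadth-first search. This reading of "complete" is an assumption, as are the node names; the source does not describe how the check was carried out.

```python
from collections import defaultdict, deque


def is_complete_explanation(explanation_edges, drug, symptom):
    """Return True if the explanation edges connect the drug to the symptom."""
    adjacency = defaultdict(set)
    for a, b in explanation_edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    visited = {drug}
    queue = deque([drug])
    while queue:  # breadth-first search from the drug node
        node = queue.popleft()
        if node == symptom:
            return True
        for neighbor in adjacency[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical explanation subgraphs (edge lists) for one prediction.
connected = [("drug:X1", "gene:G2"), ("gene:G2", "gene:G1"), ("gene:G1", "symptom:S1")]
disjointed = [("drug:X1", "gene:G2"), ("gene:G1", "symptom:S1")]

complete = is_complete_explanation(connected, "drug:X1", "symptom:S1")
incomplete = is_complete_explanation(disjointed, "drug:X1", "symptom:S1")
```

A connected path only shows that the explanation hangs together structurally; whether each step is biologically supported still requires the literature check described above.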
Rare Disease Knowledge Graphs
Graph Construction
We constructed two different knowledge graphs focusing on Duchenne muscular dystrophy (DMD). The first graph contained essential nodes directly related to the disease. The second graph included additional information about various disease symptoms, leading to richer data.
Graph Topology
The features of both knowledge graphs were examined, including the number of nodes and connections. Each graph demonstrated different characteristics, such as clustering behavior and connectivity, which can influence how well the model performs and how comprehensible the explanations are.
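The kind of topology summary mentioned above can be sketched in a few lines. The example computes node count, edge count, average degree, and average local clustering coefficient for an undirected edge list; the toy graph (a triangle with one pendant node) is invented and says nothing about the actual DMD graphs.

```python
from collections import defaultdict
from itertools import combinations


def topology_summary(edges):
    """Basic statistics for an undirected graph given as an edge list."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adjacency[a].add(b)
            adjacency[b].add(a)
    n_nodes = len(adjacency)
    n_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    clustering = []
    for nbrs in adjacency.values():
        k = len(nbrs)
        if k < 2:
            clustering.append(0.0)
            continue
        # Fraction of neighbor pairs that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adjacency.get(u, ()))
        clustering.append(2 * links / (k * (k - 1)))
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "average_degree": 2 * n_edges / n_nodes if n_nodes else 0.0,
        "average_clustering": sum(clustering) / n_nodes if n_nodes else 0.0,
    }


# Toy graph: triangle a-b-c plus a pendant node d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
summary = topology_summary(toy_edges)
```

Comparing such summaries across two graphs is one simple way to see how they differ in connectivity and clustering.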
Results and Findings
Performance Metrics
Both knowledge graphs demonstrated strong performance when tested for predicting drug-symptom links. The results indicated high scores in various metrics, showcasing the effectiveness of our approach.
Predictions Validation
By checking our predictions against available literature, we found that some drug candidates proposed by our model have been previously documented for treating specific symptoms. This validation strengthens the credibility of the predictions made by our approach.
Explanation Analysis
The explanations generated by our method were evaluated for their usefulness. Many explanations were found to be comprehensive, providing clear and logical connections that would be beneficial for researchers in understanding the proposed drug-symptom links.
Generalizability
To confirm that our method can be applied to other diseases, we tested it on Alzheimer’s disease and a specific type of amyotrophic lateral sclerosis (ALS). The results continued to show strong performance, indicating that our approach can extend beyond rare diseases to other conditions as well.
Discussion
Our findings suggest that the integration of knowledge graphs with AI can significantly enhance drug repurposing efforts, especially for rare diseases that lack extensive research. The ability to provide clear explanations for predictions bolsters trust in AI-driven decisions and supports researchers in validating their findings.
Limitations
Despite the successes of our approach, there are still challenges to address. We only used one explainability method, which may not cover all aspects of the data. Further research is needed to refine these explanations and adapt them to different scenarios.
Future Directions
Going forward, additional explanation methods should be explored, along with ways to improve the consistency of results. A better understanding of how the structure of knowledge graphs affects the quality of predictions can also strengthen the research. Collaboration between computational experts and experimental researchers will be essential for the continued evolution of this work.
Conclusion
In conclusion, our study has shown that combining knowledge graphs with state-of-the-art AI techniques can lead to meaningful advancements in drug repurposing for rare diseases. The ability to generate understandable explanations for predictions is crucial, as it helps researchers form hypotheses that can be tested in the lab. This innovative approach promises to accelerate drug research and improve outcomes for individuals affected by rare conditions.
Original Source
Title: Knowledge Graphs and Explainable AI for Drug Repurposing on Rare Diseases
Abstract: Artificial Intelligence (AI)-based drug repurposing is an emerging strategy to identify drug candidates to treat rare diseases. However, cutting-edge algorithms based on Deep Learning (DL) typically don't provide a human-understandable explanation supporting their predictions. This is a problem because it hampers the biologist's ability to decide which predictions are the most plausible drug candidates to test in costly lab experiments. In this study, we propose rd-explainer, a novel AI drug repurposing method for rare diseases which obtains possible drug candidates together with human-understandable explanations. The method is based on Graph Neural Network (GNN) technology and explanations were generated as semantic graphs using state-of-the-art eXplainable AI (XAI). The model learns features from current background knowledge on the target rare disease structured as a Knowledge Graph (KG), which integrates curated facts and their evidence on different biomedical entities such as symptoms, drugs, genes and ortholog genes. Our experiments demonstrate that our method has excellent performance that is superior to state-of-the-art models. We investigated the application of XAI on drug repurposing for rare diseases and we prove our method is capable of discovering plausible drug candidates based on testable explanations. The data and code are publicly available at https://github.com/PPerdomoQ/rare-disease-explainer.
Highlights:
- We demonstrated the use of graph-based explainable AI for drug repurposing on rare diseases to accelerate sound discovery of new therapies for this underrepresented group.
- We developed rd-explainer for rare disease specific drug research for faster translation. It predicts drugs to treat symptoms/phenotypes, it is highly performant and novel candidates are plausible according to evidence in the scientific literature and clinical trials. Key is that it learns a GNN model that is trained on a knowledge graph built specifically for a rare disease. We provide rd-explainer code freely available for the community.
- rd-explainer is researcher-centric interpretable ML for hypothesis generation and lab-in-the-loop drug research. Explanations of predictions are semantic graphs in line with human reasoning.
- We detected an effect of knowledge graph topology on explainability. This highlights the importance of knowledge representation for the drug repurposing task.
Authors: Nuria Queralt-Rosinach, P. Perdomo-Quinteiro, K. Wolstencroft, M. Roos
Last Update: 2024-10-17 00:00:00
Language: English
Source URL: https://www.biorxiv.org/content/10.1101/2024.10.17.618804
Source PDF: https://www.biorxiv.org/content/10.1101/2024.10.17.618804.full.pdf
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.