Advancements in Detecting Gene Fusions in Cancer
New techniques improve detection and analysis of gene fusions linked to cancer.
Table of Contents
- How Gene Fusions Lead to Cancer
- Advancements in Fusion Gene Detection
- Limitations of Short Reads in Detection
- Long-Read Sequencing Technologies
- CTAT-LR-fusion: A New Tool for Detecting Fusions
- Evaluating the Performance of CTAT-LR-fusion
- Applications in Single-Cell Transcriptomics
- Challenges and Future Directions
- Conclusion
- Original Source
- Reference Links
Cancer cells often carry genomic rearrangements that can create fusion genes: hybrid genes formed when parts of two different genes are joined. Such rearrangements can activate oncogenes, which promote cancer growth, or disable tumor suppressor genes, which normally help prevent cancer.
Certain fusion genes are closely associated with specific cancers. For example, the BCR::ABL1 fusion gene is commonly found in chronic myelogenous leukemia (CML). Other notable examples include SS18::SSX in synovial sarcoma and TMPRSS2::ERG in prostate cancer. Detecting these fusions is important for diagnosis, particularly in pediatric cancers, where specific fusions serve as diagnostic markers.
How Gene Fusions Lead to Cancer
Gene fusions contribute to cancer in more than one way. In one common mechanism, the rearrangement moves a gene into a new genomic context that changes how it is regulated, so that it is expressed inappropriately. In another, the fusion encodes a new protein with altered function. Either change can disrupt normal cell behavior and lead to the uncontrolled cell growth that is characteristic of cancer.
Identifying gene fusions has therefore become an important part of cancer research. It helps in discovering biomarkers for diagnosis and in developing targeted therapies, such as the tyrosine kinase inhibitors used to treat CML.
Advancements in Fusion Gene Detection
In recent years, RNA sequencing (RNA-seq) has become a popular technique for detecting gene fusions. RNA-seq is preferred because it is generally cheaper than sequencing the whole genome and because it measures the actual RNA products of gene fusions. The Illumina platform, which produces short reads, has become the common choice for RNA-seq in cancer research.
Many computational methods have been developed to analyze data from Illumina RNA-seq to identify potential fusion genes. These methods have led to significant findings about fusion genes in both cancer and normal tissues. While fusions in cancer are often due to genomic rearrangements, those found in normal tissues are typically caused by processes like splicing or natural genetic variations.
Limitations of Short Reads in Detection
Despite these advances, the short length of the reads limits what RNA-seq can reveal. A short read covers only a small part of a fusion transcript, so additional computational methods are needed to reconstruct the complete sequence.
Moreover, RNA-seq protocols that sequence only part of each RNA molecule, for example only one end, can miss fusion breakpoints that lie elsewhere in the transcript. This is where long-read sequencing comes into play.
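The short-read limitation can be made concrete with a little arithmetic. The sketch below is a toy model, not something taken from the study: it assumes reads start at uniformly random positions within a transcript, and the function name `spanning_fraction`, the transcript length, the breakpoint position, and the 20-base anchor requirement are all illustrative assumptions.

```python
def spanning_fraction(transcript_len, breakpoint, read_len, min_anchor=20):
    """Fraction of possible read placements that cross a fusion breakpoint.

    Toy model (an assumption, not the study's method): a read of length
    `read_len` starts at any position of the transcript with equal
    probability and must lie entirely within it. A read "spans" the
    breakpoint if it covers at least `min_anchor` bases on each side.
    """
    if read_len > transcript_len:
        raise ValueError("read_len must not exceed transcript_len")

    # Number of start positions at which the read fits inside the transcript.
    total_starts = transcript_len - read_len + 1

    # Earliest and latest start positions that still leave `min_anchor`
    # bases on both sides of the breakpoint.
    first = max(0, breakpoint + min_anchor - read_len)
    last = min(transcript_len - read_len, breakpoint - min_anchor)
    spanning_starts = max(0, last - first + 1)

    return spanning_starts / total_starts


# Hypothetical 3,000-base fusion transcript with its breakpoint at base 1,200.
short = spanning_fraction(3000, 1200, read_len=150)
full = spanning_fraction(3000, 1200, read_len=3000)
print(f"150-base read:    {short:.3f}")
print(f"full-length read: {full:.3f}")
```

Under these assumptions only about 4% of 150-base read placements cross the breakpoint, and each of those reads shows just 150 bases of context, whereas a read covering the whole molecule always crosses it and shows the full transcript structure.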
Long-Read Sequencing Technologies
Long-read sequencing technologies, such as those offered by PacBio and Oxford Nanopore, can sequence entire RNA molecules. A single read can therefore capture a complete fusion transcript, including its breakpoint and its splicing pattern.
Initially, these long-read technologies faced challenges, including low throughput and higher error rates. However, improvements in both technologies have led to more accurate results, making them suitable for detailed investigation of gene fusions, especially in cancer research.
CTAT-LR-fusion: A New Tool for Detecting Fusions
To address the need for improved fusion detection from long reads, the researchers developed a tool called CTAT-LR-fusion. It is part of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT), a larger framework for analyzing cancer transcriptomes. CTAT-LR-fusion is designed specifically for long-read RNA-seq, with or without companion short reads. Its functions include detecting chimeric reads, identifying fusion transcripts, quantifying their expression, and visualizing the results.
To test the tool, the researchers created simulated datasets that mimic different sequencing technologies and error rates. They also sequenced cancer cell lines to validate its accuracy on real data.
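The idea behind chimeric read detection can be sketched in a few lines. This is a minimal illustration and not CTAT-LR-fusion's actual algorithm: the `Segment` record, the `find_chimeric_reads` function, and the 25-base minimum segment length are hypothetical, and the sketch assumes each read has already been aligned and its pieces assigned to genes.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a long read (hypothetical record for illustration)."""
    read_id: str
    read_start: int  # 0-based start of the piece within the read
    read_end: int    # end of the piece within the read (exclusive)
    gene: str        # gene the piece aligns to


def find_chimeric_reads(segments, min_segment_len=25):
    """Return {read_id: (first_gene, second_gene)} for reads joining two genes.

    Genes are listed in their order along the read. Reads that touch one
    gene, or that switch genes more than once, are not reported.
    """
    by_read = defaultdict(list)
    for seg in segments:
        # Skip very short pieces, which are more likely to be alignment noise.
        if seg.read_end - seg.read_start >= min_segment_len:
            by_read[seg.read_id].append(seg)

    chimeric = {}
    for read_id, segs in by_read.items():
        segs.sort(key=lambda s: s.read_start)
        genes_in_order = []
        for seg in segs:
            # Collapse consecutive pieces that map to the same gene.
            if not genes_in_order or genes_in_order[-1] != seg.gene:
                genes_in_order.append(seg.gene)
        if len(genes_in_order) == 2:
            chimeric[read_id] = (genes_in_order[0], genes_in_order[1])
    return chimeric


# Made-up alignments: read1 joins BCR to ABL1, read2 maps to a single gene.
segments = [
    Segment("read1", 0, 900, "BCR"),
    Segment("read1", 900, 2100, "ABL1"),
    Segment("read2", 0, 1500, "GAPDH"),
]
for read_id, (left, right) in find_chimeric_reads(segments).items():
    print(f"{read_id}: {left}::{right}")
```

A real tool must also handle sequencing errors, ambiguous alignments, and artifacts that mimic fusions, which this sketch ignores.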
Evaluating the Performance of CTAT-LR-fusion
CTAT-LR-fusion was benchmarked against existing tools using simulated and real long-read datasets. It identified fusion transcripts more accurately than the alternative methods, and it outperformed them in reporting the correct order and orientation of the fused genes.
When applied to cancer cell lines, CTAT-LR-fusion generally detected more fusion transcripts from long reads than traditional short-read RNA-seq methods did, although there were notable exceptions. Combining short and long reads increased detection further. The findings show that long reads allow a deeper understanding of the complexity and variety of fusions present in cancers.
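To see why gene order matters in such a benchmark, consider how predictions might be scored. The sketch below is an illustrative assumption rather than the scoring code used in the study: the function `score_predictions` and the example calls are hypothetical, and fusions are written in the `GENE1::GENE2` form used earlier in this article.

```python
def score_predictions(predicted, truth, order_sensitive=True):
    """Return (precision, recall, F1) for predicted versus true fusion names.

    Names look like 'BCR::ABL1'. With order_sensitive=True, 'ERG::TMPRSS2'
    does not match 'TMPRSS2::ERG'; with False, the two are treated as equal.
    """
    def normalize(name):
        left, right = name.split("::")
        if order_sensitive:
            return (left, right)
        return tuple(sorted((left, right)))

    pred = {normalize(name) for name in predicted}
    true = {normalize(name) for name in truth}
    true_positives = len(pred & true)

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(true) if true else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: the second prediction has the partner genes reversed.
truth = ["BCR::ABL1", "TMPRSS2::ERG"]
predicted = ["BCR::ABL1", "ERG::TMPRSS2"]

print(score_predictions(predicted, truth))                         # (0.5, 0.5, 0.5)
print(score_predictions(predicted, truth, order_sensitive=False))  # (1.0, 1.0, 1.0)
```

A tool that reports the right gene pair in the wrong order looks perfect under the lenient scoring but loses half its score under the strict one, which is why order and orientation are worth evaluating separately.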
Applications in Single-Cell Transcriptomics
A significant area of research is how gene fusions behave at the single-cell level. The ability to analyze individual cells provides insights into tumor heterogeneity and the biological impact of fusions. Using CTAT-LR-fusion, researchers analyzed tumor samples and identified various fusion genes.
In a T-cell infiltrated melanoma sample, the tool helped identify a specific fusion gene that was present in a high percentage of cancer cells. This highlights the potential of long-read sequencing and CTAT-LR-fusion in uncovering important genetic information related to cancer.
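A statement such as "present in a high percentage of cancer cells" comes down to counting, per cell type, how many cells have at least one read supporting the fusion. The sketch below shows that calculation on made-up data; the function `fusion_cell_fraction`, the cell barcodes, and the cell-type labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def fusion_cell_fraction(cell_types, fusion_cells):
    """Return {cell_type: fraction of cells in which the fusion was detected}.

    cell_types:   {cell_barcode: cell_type} for every profiled cell
    fusion_cells: set of barcodes with at least one fusion-supporting read
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for barcode, cell_type in cell_types.items():
        totals[cell_type] += 1
        if barcode in fusion_cells:
            positives[cell_type] += 1
    return {ct: positives[ct] / totals[ct] for ct in totals}


# Made-up data: four tumor cells and two T cells.
cell_types = {
    "AAAC": "tumor", "AAAG": "tumor", "AACT": "tumor", "AAGA": "tumor",
    "ACGT": "T cell", "AGTC": "T cell",
}
fusion_cells = {"AAAC", "AAAG", "AACT"}

for cell_type, fraction in fusion_cell_fraction(cell_types, fusion_cells).items():
    print(f"{cell_type}: {fraction:.0%} of cells carry the fusion")
```

In this toy example the fusion appears in 75% of tumor cells and in none of the T cells, the kind of pattern that marks a fusion as tumor-specific.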
Challenges and Future Directions
As research continues, several challenges remain. Fusion detections need more extensive validation across diverse cancer types, and data from long-read and short-read sequencing platforms need to be integrated.
As sequencing technologies develop rapidly, the focus will be on further optimizing methods for detecting and analyzing gene fusions. This includes improving the sensitivity and specificity of detection tools and exploring the full range of biological implications of fusions in various cancers.
Conclusion
The study of fusion genes in cancer is a rapidly evolving field. New technologies and tools like CTAT-LR-fusion improve our ability to identify and analyze these critical genetic changes. As we learn more about the role of gene fusions in cancer, we move closer to personalized medicine approaches that can lead to better outcomes for patients.
Integrating long-read sequencing into cancer research has the potential to uncover new biomarkers and support the development of more effective therapies, which could in turn change how cancer is diagnosed and treated.
Original Source
Title: CTAT-LR-fusion: accurate fusion transcript identification from long and short read isoform sequencing at bulk or single cell resolution
Abstract: Gene fusions are found as cancer drivers in diverse adult and pediatric cancers. Accurate detection of fusion transcripts is essential in cancer clinical diagnostics, prognostics, and for guiding therapeutic development. Most currently available methods for fusion transcript detection are compatible with Illumina RNA-seq involving highly accurate short read sequences. Recent advances in long read isoform sequencing enable the detection of fusion transcripts at unprecedented resolution in bulk and single cell samples. Here we developed a new computational tool CTAT-LR-fusion to detect fusion transcripts from long read RNA-seq with or without companion short reads, with applications to bulk or single cell transcriptomes. We demonstrate that CTAT-LR-fusion exceeds fusion detection accuracy of alternative methods as benchmarked with simulated and real long read RNA-seq. Using short and long read RNA-seq, we further apply CTAT-LR-fusion to bulk transcriptomes of nine tumor cell lines, and to tumor single cells derived from a melanoma sample and three metastatic high grade serous ovarian carcinoma samples. In both bulk and in single cell RNA-seq, long isoform reads yielded higher sensitivity for fusion detection than short reads with notable exceptions. By combining short and long reads in CTAT-LR-fusion, we are able to further maximize detection of fusion splicing isoforms and fusion-expressing tumor cells. CTAT-LR-fusion is available at https://github.com/TrinityCTAT/CTAT-LR-fusion/wiki.
Authors: Brian J. Haas, Q. Qin, V. Popic, H. Yu, E. White, A. Khorgade, A. Shin, K. Wienand, A. Dondi, N. Beerenwinkel, F. Vazquez, A. M. AlKhafaji
Last Update: 2024-02-28 00:00:00
Language: English
Source URL: https://www.biorxiv.org/content/10.1101/2024.02.24.581862
Source PDF: https://www.biorxiv.org/content/10.1101/2024.02.24.581862.full.pdf
Licence: https://creativecommons.org/licenses/by/4.0/
Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.
Reference Links
- https://github.com/TrinityCTAT/CTAT-LR-fusion/wiki
- https://github.com/TrinityCTAT/ctat-minimap2
- https://github.com/TrinityCTAT/CTAT-LR-fusion
- https://github.com/Oshlack/JAFFA
- https://github.com/PacificBiosciences/pbfusion/releases
- https://github.com/Maggi-Chen/FusionSeeker
- https://github.com/WGLab/LongGF
- https://github.com/broadinstitute/CTAT-LRF-Paper/tree/main/0.Workflows_and_Dockers
- https://data.broadinstitute.org/Trinity/CTAT_FUSIONTRANS_BENCHMARKING/on_simulated_data/simulated_fusion_transcript_sequences/
- https://ndownloader.figshare.com/files/27676470
- https://github.com/PacificBiosciences/ccs
- https://github.com/yukiteruono/pbsim3/issues/12
- https://github.com/MethodsDev/pbsim3
- https://github.com/fusiontranscripts/LR-FusionBenchmarking
- https://github.com/broadinstitute/CTAT-LRF-Paper
- https://zenodo.org/records/10650516