Simple Science

Cutting edge science explained simply


Advancements in Plant Organelle Genome Assembly

New toolkit Oatk improves assembly of plant organelle genomes.

Richard Durbin, C. Zhou, M. Brown, M. Blaxter, The Darwin Tree of Life Project Consortium, S. A. McCarthy

Plant cells contain specialized compartments called organelles that help them function. Two important organelles are plastids and mitochondria. Plastids are best known for photosynthesis, the process by which plants convert light energy into chemical energy. Mitochondria carry out respiration, which converts the energy stored in food into a form the plant can use. Both plastids and mitochondria carry their own DNA, which has changed over long evolutionary timescales.

The DNA of Plastids and Mitochondria

The DNA in a plastid, called the plastome, is fairly consistent in size, typically 120 to 160 thousand base pairs. It is usually circular and divided into distinct regions, typically two single-copy regions separated by a pair of inverted repeats. In contrast, mitochondrial DNA, known as the mitogenome, varies widely in size, from a few tens of thousands of base pairs to more than ten million. Mitogenomes can be circular or linear and can adopt a variety of structures, which makes them more diverse than plastomes.

Importance of Organelle DNA

Organelle DNA matters for more than the functioning of the plant. It also carries useful information about plant diversity and evolutionary history. Scientists often use organelle DNA to study relationships between plant species and to help identify them.

Techniques for Sequencing Organelle DNA

To study organelle DNA, scientists first have to sequence it. Early work relied on Sanger sequencing, which is labor-intensive and expensive. More recent approaches sequence all the DNA in a sample of cells at once and then recover the organelle genomes from that data. This is faster and more cost-effective, but it brings challenges of its own, especially because most plant organelle genomes contain repeated sequences that are hard to place correctly.

The Need for New Tools

Given these challenges, there is a need for software designed specifically to assemble organelle genomes from sequencing data. Several tools are now available for this purpose, mainly designed to work with high-throughput sequencing data. These tools share some common steps, such as distinguishing organelle DNA from nuclear DNA and joining DNA fragments into complete genomes.
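As a rough illustration of the first common step, the sketch below separates candidate organelle contigs from nuclear ones by read depth. It relies on an assumption that is not taken from the article: because each cell holds many copies of its organelle genomes, organelle sequences tend to be covered far more deeply than nuclear ones. The function name `split_by_depth`, the `fold` threshold, and the contig names and depths are all hypothetical, and real tools may use other signals, such as gene annotation.

```python
from statistics import median


def split_by_depth(contig_depths, fold=5.0):
    """Split contigs into candidate organelle and nuclear sets.

    contig_depths: dict mapping contig name -> mean read depth.
    fold: hypothetical cutoff; contigs covered at least `fold` times
          the median depth are flagged as candidate organelle sequence.
    """
    # The median depth stands in for the nuclear baseline (an assumption).
    baseline = median(contig_depths.values())
    organelle = {name for name, depth in contig_depths.items()
                 if depth >= fold * baseline}
    nuclear = set(contig_depths) - organelle
    return organelle, nuclear


# Made-up depths for illustration only.
depths = {"ctg1": 30.0, "ctg2": 28.0, "ctg3": 33.0,
          "ctg4": 950.0, "ctg5": 210.0}
organelle, nuclear = split_by_depth(depths)
print(sorted(organelle))
```

A depth cutoff like this is only a first-pass filter: nuclear repeats can also be deeply covered, so a practical pipeline would confirm candidates by checking for organelle genes.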

The Development of Oatk

To address the limitations of existing tools, a new toolkit called Oatk has been created. Oatk aims to assemble plastid and mitochondrial genomes efficiently from high-accuracy long reads, and it is designed to be fast and easy to use. It combines three key techniques: a k-mer based assembler that rapidly builds an assembly graph, a database of profile hidden Markov models (HMMs) that identifies organelle genes robustly, and a new search method that finds the best-supported path through the assembly graph to resolve complex structures.
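To give a feel for graph-based assembly, the sketch below builds a toy k-mer graph from overlapping reads and walks it greedily along the best-supported edges until it closes a circle. This is a textbook de Bruijn-style illustration, not Oatk's actual algorithm. The function names, the 12-base toy genome, the reads, and the choice of k are all made up for the example.

```python
from collections import defaultdict


def build_kmer_graph(reads, k):
    """Return {(prefix, suffix): count}, one edge per k-mer.

    Nodes are (k-1)-mers; an edge joins the prefix and suffix of a k-mer,
    and its count is the number of times that k-mer occurs in the reads.
    """
    edges = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


def walk_circle(edges, start):
    """Greedily follow the highest-count edge from `start`.

    Returns the circular sequence if the walk comes back to `start`,
    or None if it hits a dead end or loops back somewhere else.
    """
    outgoing = defaultdict(list)
    for (src, dst), count in edges.items():
        outgoing[src].append((count, dst))
    path, seen, node = [], set(), start
    while node not in seen:
        seen.add(node)
        path.append(node)
        if not outgoing[node]:
            return None  # dead end: the reads do not close a circle
        node = max(outgoing[node])[1]  # best-supported next node
    if node != start:
        return None
    # Each node contributes its first base to the circular sequence.
    return "".join(n[0] for n in path)


# Reads sampled from a made-up circular genome, ATGCGTACCTGA.
reads = ["ATGCGTAC", "CGTACCTG", "ACCTGAAT", "TGAATGCG"]
graph = build_kmer_graph(reads, k=4)
genome = walk_circle(graph, "ATG")
print(genome)
```

The greedy walk works here only because every node in the toy graph has a single outgoing edge. Repeats create branching nodes, which is why real organelle assemblies need a more careful search for the path best supported by the data.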

Using Oatk for Genome Assembly

Oatk was used to assemble organelle genomes for 195 different plant species. The assembly results were compared with other tools to evaluate Oatk's performance. The findings showed that Oatk performed better than other tools in many cases, successfully assembling genomes even in challenging situations.

Summary of Assembly Results

Oatk assembled both the plastid and the mitochondrial genome for all 195 plant species studied. Most of these genomes were circular, although some were linear. The results showed a wide variety of genome sizes and structures across species.

Understanding Plastid Structures

Most assembled plastid genomes had a common structure that includes different regions necessary for their function. However, there were also examples of plastid genomes with variations, leading to diverse genome sizes within the same structural framework. This diversity was especially seen in different plant groups, such as grasses and mosses.

Insights into Mitochondrial Structures

The mitochondrial genomes displayed even greater diversity in size and structure than plastid genomes. They ranged from simple circular forms to complex arrangements with multiple circular components. Some species had unusual structures, such as linear components. This complexity highlighted the dynamic nature of mitogenomes across different plant species.

Heteroplasmy in Organelles

Heteroplasmy, the presence of more than one form of an organelle genome within the same cell, was frequently observed. This indicates that plants can maintain several versions of their organelle genomes, which can arise from genetic changes over time. The coexistence of different forms of plastid and mitochondrial DNA within individual plants suggests a complex process of evolution and adaptation.

Gene Transfers Between Organelles

Another interesting finding was the sharing of DNA sequences between plastid and mitochondrial genomes. This indicates that there may have been historical exchanges of genetic material between these organelles, further complicating their structures. Scientists found many shared DNA sequences, emphasizing how interconnected these two organelles may be.
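One simple way to look for shared sequence between two genomes is to compare their k-mers, as in the sketch below. This is an illustrative assumption about how such a comparison could be done, not the method used in the study. The function names and the two short sequences are invented, and for brevity the sketch ignores the reverse strand.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(plastome, mitogenome, k):
    """Return the k-mers present in both sequences (forward strand only)."""
    return kmers(plastome, k) & kmers(mitogenome, k)


# Toy sequences that share the made-up segment GGCTTAAC.
plastome = "GATTACAGGCTTAAC"
mitogenome = "CCCGGCTTAACTTT"
shared = shared_kmers(plastome, mitogenome, k=5)
print(sorted(shared))
```

With real genomes, k would need to be large enough that matches are unlikely to occur by chance, and runs of adjacent shared k-mers would be merged into longer shared segments.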

Conclusion

In summary, Oatk shows great potential for studying plant organelle genomes more effectively. By efficiently assembling organelle DNA from high-accuracy long reads, Oatk can help reveal the diversity and complexity of these genomes. Understanding them is crucial for grasping how plants have evolved and how they function in their environments. The study of organelle genomes opens the door to better knowledge of plant biology, evolution, and biodiversity.

Original Source

Title: Oatk: a de novo assembly tool for complex plant organelle genomes

Abstract: Plant organelle genomes, particularly the large mitochondrial genomes with intricate repetitive structures, present significant challenges for assembly. The advent of long-read sequencing technologies provides a transformative opportunity to generate complete genomes, but problems of resolving alternative structures remain. Here we introduce a novel tool for plant organelle genome assembly from high-accuracy long reads. Our method employs a k-mer based assembler for rapid assembly graph construction, integrates a profile HMM gene database for robust organelle sequence annotation, and leverages a new search method to find the best supported path through the assembly graph. We describe high-quality organelle assemblies for 195 plant species and demonstrate improvements over other methods. The assembled genomes provide multiple insights into structural complexity, heteroplasmy, and DNA exchange between organelles.

Authors: Richard Durbin, C. Zhou, M. Brown, M. Blaxter, The Darwin Tree of Life Project Consortium, S. A. McCarthy

Last Update: 2024-10-28 00:00:00

Language: English

Source URL: https://www.biorxiv.org/content/10.1101/2024.10.23.619857

Source PDF: https://www.biorxiv.org/content/10.1101/2024.10.23.619857.full.pdf

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to biorxiv for use of its open access interoperability.
