Simple Science

Cutting edge science explained simply


Audio Analysis in COVID-19 Detection

Using audio signals to identify respiratory health risks.

― 7 min read


Audio-Based COVID-19 Detection: Leveraging sound to assess respiratory conditions.

The COVID-19 pandemic has pushed scientists and researchers to find new ways to detect and manage respiratory illnesses. One focus has been on using audio signals, like coughs and breaths, to identify signs of respiratory problems, including COVID-19. The smarty4covid project was created to gather audio recordings and other relevant data to help build models that can assess COVID-19 risk.

The Smarty4covid Dataset

The smarty4covid dataset includes a large collection of audio recordings, specifically 4,676 coughs, 4,665 regular breathing sounds, 4,695 deep breathing sounds, and 4,291 voice samples. These recordings were collected from people using their mobile devices as part of a crowd-sourcing effort. In addition to audio data, the dataset contains self-reported information such as COVID-19 test results and health history. This makes it a valuable resource for developing models to detect COVID-19 risk.

The dataset is organized as a web-ontology language (OWL) knowledge base. This structure allows researchers to combine data from other sources and perform complex queries. The smarty4covid dataset has been used to create models that identify specific segments in audio recordings, like coughs and breaths, and extract key health indicators from breathing sounds.
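To make the idea of a queryable knowledge base concrete, here is a minimal sketch of a triple store with pattern matching, in the spirit of the OWL structure described above. The schema, identifiers, and instance data are invented for illustration; they are not the actual smarty4covid ontology.

```python
# Toy triple store: facts as (subject, predicate, object), with wildcard queries.
# All names below are hypothetical, not the real smarty4covid schema.

triples = {
    ("user:1", "hasRecording", "rec:cough_01"),
    ("user:1", "reportedSymptom", "symptom:dry_cough"),
    ("user:1", "testResult", "covid:positive"),
    ("user:2", "hasRecording", "rec:breath_07"),
    ("user:2", "testResult", "covid:negative"),
    ("rec:cough_01", "audioType", "type:cough"),
    ("rec:breath_07", "audioType", "type:regular_breathing"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [(ts, tp, to) for (ts, tp, to) in triples
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

# Complex query: users who tested positive AND submitted a cough recording.
positive_users = {s for s, _, _ in match(p="testResult", o="covid:positive")}
cough_recs = {s for s, _, _ in match(p="audioType", o="type:cough")}
users_with_cough = {s for s, _, o in match(p="hasRecording") if o in cough_recs}
print(sorted(positive_users & users_with_cough))  # ['user:1']
```

A real OWL knowledge base adds reasoning on top of this kind of matching (class hierarchies, property semantics), which is what makes consolidating external datasets and answering richer queries practical.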

Background

The COVID-19 pandemic has led to many technological advancements, including vaccines and mobile applications aimed at managing the health crisis. Mobile health technologies have played a significant role in combating COVID-19 by raising awareness, collecting health data, providing telemedicine services, and supporting healthcare professionals in their decision-making processes.

Artificial Intelligence (AI) and Machine Learning (ML) have also been integral in addressing COVID-19 challenges. They have accelerated research, improved diagnosis, and facilitated epidemiological modeling based on various data sources, including audio recordings.

The urgency for quick diagnosis during the pandemic has highlighted some limitations of existing testing methods. For example, RT-PCR tests can be slow and require specialized equipment and trained personnel. Antigen tests are quicker but often less sensitive. Therefore, there is a growing need for a fast and affordable approach to detect COVID-19, and the smarty4covid project aims to fulfill this need by analyzing audio recordings.

Crowd-Sourcing Audio Data

Several projects have worked to collect audio data from the public since the start of the pandemic. The COVID-19 Sounds project was one of the first initiatives, leading to the compilation of a large database of audio samples. Other databases, like Coswara and Coughvid, were established to gather various types of audio related to respiratory health.

Despite the promise of crowd-sourced data, there are challenges. Many recordings can be low quality or contain unrelated sounds. This underlines the importance of cleaning and curating the data. Projects like Coughvid and COVID-19 Sounds have developed methods to filter out noise and irrelevant segments from recordings.

Data Curation Process

The crowd-sourced recordings often contain invalid submissions due to poor audio quality or user mistakes. To address this, the smarty4covid project uses a two-step data curation process. First, they clean the dataset by removing erroneous audio samples. Second, trained professionals, like doctors and specialists, label the recordings based on the presence of detectable health issues.

AI tools can help in this process by assessing audio quality. For instance, YAMNet, a pretrained audio event classifier, has been used to identify and filter out recordings that do not meet quality standards. This step is crucial to ensure the reliability of the data used in developing detection models.
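Running YAMNet itself requires TensorFlow Hub, but the filtering step it enables can be sketched on its own: given per-class scores from such a model, keep only recordings whose top class is a relevant sound at sufficient confidence. The class names, scores, and the 0.5 threshold below are illustrative assumptions, not values from the project.

```python
# Sketch of quality filtering given class scores from an audio event model
# such as YAMNet. Scores and the threshold are made-up illustrations.

VALID_CLASSES = {"Cough", "Breathing", "Speech"}

def is_valid_recording(class_scores, threshold=0.5):
    """Keep a recording only if its top-scoring class is a relevant
    respiratory/voice sound with sufficient confidence."""
    top_class = max(class_scores, key=class_scores.get)
    return top_class in VALID_CLASSES and class_scores[top_class] >= threshold

submissions = [
    {"id": "a", "scores": {"Cough": 0.84, "Music": 0.05, "Silence": 0.11}},
    {"id": "b", "scores": {"Music": 0.72, "Speech": 0.20, "Cough": 0.08}},
    {"id": "c", "scores": {"Breathing": 0.41, "Silence": 0.39, "Cough": 0.20}},
]

kept = [s["id"] for s in submissions if is_valid_recording(s["scores"])]
print(kept)  # ['a'] — 'b' is dominated by music, 'c' is too low-confidence
```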

Extracting Breathing Features

Breathing patterns can reveal vital information about a person's health. The smarty4covid project aims to extract clinically relevant indicators from audio recordings of breathing. Key indicators include the respiratory rate (RR), the inhalation-to-exhalation ratio (I/E ratio), and the fractional inspiration time (FIT).

The breathing feature extraction process has two parts. First, the system identifies segments in the audio that contain breathing sounds. Second, it distinguishes between inhalation and exhalation phases. This helps in calculating the respiratory indicators that are important in assessing respiratory health.
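Once the inhalation and exhalation phases are marked, the indicators themselves reduce to simple arithmetic over the phase durations. The sketch below assumes segmentation has already produced labeled time intervals; the intervals are made up for illustration.

```python
# Computing RR, I/E ratio, and FIT from already-segmented inhalation and
# exhalation intervals (start, end in seconds). The segmentation itself
# would come from the audio model; these intervals are illustrative.

def respiratory_indicators(inhales, exhales):
    """inhales/exhales: lists of (start_s, end_s) tuples, one per phase."""
    t_in = sum(e - s for s, e in inhales)
    t_ex = sum(e - s for s, e in exhales)
    n_breaths = min(len(inhales), len(exhales))
    total_time = t_in + t_ex
    return {
        "respiratory_rate_bpm": 60.0 * n_breaths / total_time,
        "ie_ratio": t_in / t_ex,           # inhalation : exhalation
        "fit": t_in / total_time,          # fractional inspiration time
    }

# Three breath cycles: 1.5 s inhale + 2.5 s exhale each.
inhales = [(0.0, 1.5), (4.0, 5.5), (8.0, 9.5)]
exhales = [(1.5, 4.0), (5.5, 8.0), (9.5, 12.0)]
ind = respiratory_indicators(inhales, exhales)
print(ind)  # RR = 15 breaths/min, I/E = 0.6, FIT = 0.375
```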

Classifying Audio Segments

To streamline the analysis, AI models have been developed to classify audio segments into categories like cough, breath, and voice. These models not only validate the quality of the smarty4covid dataset but also assist in identifying whether new audio submissions are valid for analysis.

The classification models utilize a type of AI known as Convolutional Neural Networks (CNNs). These networks analyze Mel spectrograms of the audio, image-like representations of the signal that the networks can process more easily. After training, the models can evaluate incoming audio segments and report whether they correspond to coughs, voice, or breathing.
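The audio-to-image conversion can be sketched in plain NumPy: window the waveform into short frames, take the magnitude spectrum of each frame, and pool the FFT bins through a triangular Mel filterbank. Frame sizes, Mel-band count, and the synthetic test tone here are illustrative choices, not the settings used in the paper.

```python
import numpy as np

# Minimal Mel-spectrogram sketch, illustrating the representation that
# CNN audio classifiers consume. Parameters are illustrative only.

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spectrogram(x, sr, n_fft=512, hop=256, n_mels=40):
    # Short-time power spectrum via sliding Hann-windowed FFT frames.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i*hop : i*hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2        # (frames, bins)

    # Triangular Mel filterbank mapping FFT bins to Mel bands.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bin_pts = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, ctr, hi = bin_pts[m - 1], bin_pts[m], bin_pts[m + 1]
        for k in range(lo, ctr):
            fb[m - 1, k] = (k - lo) / max(ctr - lo, 1)
        for k in range(ctr, hi):
            fb[m - 1, k] = (hi - k) / max(hi - ctr, 1)

    return np.log(power @ fb.T + 1e-10)                     # (frames, mels)

sr = 16000
t = np.arange(sr) / sr                     # 1 s of a 440 Hz test tone
spec = mel_spectrogram(np.sin(2 * np.pi * 440 * t), sr)
print(spec.shape)                          # (61, 40): time frames x Mel bands
```

The resulting 2-D time-frequency array is what the CNN treats like an image, with convolutions picking up patterns across both time and frequency.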

Addressing Potential Biases

While developing AI models, it's essential to recognize that biases may exist within the datasets. For instance, if a dataset contains more recordings from one gender than another, it may skew the AI's predictions. The smarty4covid project seeks to identify and understand these biases.

To explore potential biases, the project employs a methodology that provides counterfactual explanations. This means that researchers can examine how changing certain factors (like gender or symptom reporting) would affect the model's predictions. By understanding these biases, researchers can improve the models and make them fairer.
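The core of counterfactual probing can be shown on a toy model: flip one input factor at a time and check whether the prediction changes. The model, features, and weights below are invented purely to illustrate the idea; the paper's actual framework operates on the OWL knowledge base and real risk models.

```python
# Sketch of counterfactual probing on a toy risk model. If flipping a
# factor that should be clinically irrelevant (e.g. gender) changes the
# prediction, that flags a potential bias. Everything here is hypothetical.

def toy_risk_model(features):
    """Returns 'high' or 'low' risk from a few binary inputs."""
    score = (2.0 * features["has_cough_sound_anomaly"]
             + 1.5 * features["reported_fever"]
             + 0.5 * features["gender_male"])   # deliberately biased weight
    return "high" if score >= 2.0 else "low"

def counterfactuals(model, features):
    """For each feature, report whether flipping it alone changes the output."""
    base = model(features)
    flips = {}
    for name in features:
        altered = dict(features, **{name: 1 - features[name]})
        flips[name] = (model(altered) != base)
    return base, flips

base, flips = counterfactuals(toy_risk_model,
                              {"has_cough_sound_anomaly": 0,
                               "reported_fever": 1,
                               "gender_male": 1})
print(base, flips)
# Flipping gender alone changes the prediction here,
# which is exactly the kind of bias this probing surfaces.
```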

Knowledge Base Development

The smarty4covid project also developed an OWL knowledge base. This allows for better organization of the collected data and the integration of various databases. The OWL knowledge base is designed to enable complex queries and reasoning, helping to identify users with specific health characteristics or needs.

The knowledge base contains detailed concepts about audio recordings, user demographics, health conditions, and symptoms. This structured approach allows for clearer insights into the data collected and helps researchers make informed decisions.

Technical Validation and Results

To ensure the dataset is representative, an analysis was conducted on the demographics, symptoms, and COVID-19 prevalence among participants. Most of the users were aged between 30 and 59, a group that is generally comfortable using mobile technology. The dataset also included various underlying health conditions, such as hypertension, which can affect the progression of COVID-19.

In addition, the project validated the quality of the smarty4covid dataset by training AI models to classify audio types. This validation process confirmed that the dataset could support the development of models capable of accurately detecting respiratory issues.

Performance of AI Models

The AI classifiers developed from the smarty4covid dataset were tested against external datasets to assess their performance. Two models, one analyzing short audio segments and another analyzing longer segments, were compared. The longer segment classifier performed slightly better, showcasing the effectiveness of analyzing audio across different time scales.

To further improve performance, a multi-scale classifier was created that combines the strengths of both models. This new model outperformed previous classifiers, demonstrating higher accuracy in detecting coughs and breaths.
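One simple way to combine classifiers operating at different time scales is to fuse their class probabilities, for example with a weighted average. The probabilities and the equal weighting below are made up for illustration; the paper's actual fusion scheme may differ.

```python
# Sketch of multi-scale fusion: average the class probabilities of a
# short-segment and a long-segment classifier. Values are illustrative.

def combine_multiscale(short_probs, long_probs, w_short=0.5):
    """Weighted average of two classifiers' probability dicts."""
    return {c: w_short * short_probs[c] + (1 - w_short) * long_probs[c]
            for c in short_probs}

short_probs = {"cough": 0.55, "breath": 0.35, "voice": 0.10}
long_probs  = {"cough": 0.75, "breath": 0.15, "voice": 0.10}

fused = combine_multiscale(short_probs, long_probs)
prediction = max(fused, key=fused.get)
print(fused, prediction)  # cough wins with fused probability 0.65
```

The intuition is that short segments capture transient events like individual cough bursts, while long segments capture rhythm and context; fusing the two lets each compensate for the other's blind spots.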

Conclusion

The smarty4covid project showcases the potential for audio analysis in detecting respiratory issues and understanding COVID-19 risks. By leveraging crowd-sourced data, AI technologies, and structured knowledge bases, significant strides have been made in developing effective models for health assessment.

As more audio data is collected and analyzed, the accuracy and reliability of these models will continue to improve. This work is crucial not only in the context of the ongoing pandemic but also for managing respiratory health in general. Through responsible and transparent AI practices, the healthcare community can enhance its ability to respond to respiratory health challenges effectively.

Future Directions

Looking ahead, the smarty4covid project aims to expand its dataset and improve the AI models further. Continuous monitoring of biases and performance will be crucial in refining these tools for clinical use. Future efforts may include incorporating more diverse populations, enhancing data collection methods, and exploring new audio features that can provide additional insights into respiratory health.

The ultimate goal is to establish a reliable, accessible, and efficient means of assessing COVID-19 risk and other respiratory conditions using simple audio recordings. This approach could revolutionize how respiratory illnesses are diagnosed and monitored, making healthcare more proactive and personalized.

Original Source

Title: The smarty4covid dataset and knowledge base: a framework enabling interpretable analysis of audio signals

Abstract: Harnessing the power of Artificial Intelligence (AI) and m-health towards detecting new bio-markers indicative of the onset and progress of respiratory abnormalities/conditions has greatly attracted the scientific and research interest especially during COVID-19 pandemic. The smarty4covid dataset contains audio signals of cough (4,676), regular breathing (4,665), deep breathing (4,695) and voice (4,291) as recorded by means of mobile devices following a crowd-sourcing approach. Other self reported information is also included (e.g. COVID-19 virus tests), thus providing a comprehensive dataset for the development of COVID-19 risk detection models. The smarty4covid dataset is released in the form of a web-ontology language (OWL) knowledge base enabling data consolidation from other relevant datasets, complex queries and reasoning. It has been utilized towards the development of models able to: (i) extract clinically informative respiratory indicators from regular breathing records, and (ii) identify cough, breath and voice segments in crowd-sourced audio recordings. A new framework utilizing the smarty4covid OWL knowledge base towards generating counterfactual explanations in opaque AI-based COVID-19 risk detection models is proposed and validated.

Authors: Konstantia Zarkogianni, Edmund Dervakos, George Filandrianos, Theofanis Ganitidis, Vasiliki Gkatzou, Aikaterini Sakagianni, Raghu Raghavendra, C. L. Max Nikias, Giorgos Stamou, Konstantina S. Nikita

Last Update: 2023-07-11

Language: English

Source URL: https://arxiv.org/abs/2307.05096

Source PDF: https://arxiv.org/pdf/2307.05096

Licence: https://creativecommons.org/licenses/by-nc-sa/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
