Simple Science

Cutting edge science explained simply

Electrical Engineering and Systems Science / Image and Video Processing

Preparing Medical Images for Stroke Analysis

A look at the process for preparing CT scans for deep learning.

― 5 min read


[Figure] CT Scan Preparation for Stroke Diagnosis: standardized methods enhance medical imaging for stroke treatment.

Creating strong medical image datasets is crucial for developing advanced software that helps in stroke treatment. However, there are various challenges involved in this process. Despite the abundance of brain CT scans produced in hospitals, many of these images are not suitable for training deep learning models. This is due to issues such as poor image quality and a lack of access to a wide range of data.

The Need for Quality Datasets

Deep learning methods have become popular for analyzing medical images, but they require large amounts of data to work effectively. However, many of these datasets are not publicly available due to patient privacy concerns. This results in a limited number of small, well-curated datasets that may not represent the actual diversity seen in everyday clinical practice. Ideally, deep learning methods should work with any routine medical images captured in hospitals. Unfortunately, extra work is often needed to prepare these images before they can be used for analysis.

Common Challenges in Image Preparation

Preparing medical images entails various challenges. Some common issues include:

  1. Different Image Orientations: Images can be taken from various angles, such as axial (top-down), sagittal (side), and coronal (front). This variety can complicate the analysis.

  2. Types of Images: Scans may be reconstructed to highlight different tissues, such as soft tissue or bone, and not all of these reconstructions are useful for stroke analysis.

  3. Size Variations: Images can come in various sizes and dimensions, making it difficult to standardize them for deep learning use.

  4. Background Noise: Many scans contain unnecessary background, which can interfere with the analysis of the actual medical content.

The Data Preparation Process

To tackle these challenges, a systematic process was developed to prepare brain CT scans for deep learning analysis. The goal was to create a standardized dataset from images collected during a significant clinical trial involving stroke patients. Below are the steps taken to prepare the data:

1. Identifying Axial Images

The first step is to determine which scans are in the correct axial orientation. This is done by checking metadata associated with each image. Images that are not in the correct orientation may lead to errors during analysis.
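
As an illustration only, this check can be approximated by reading the slice orientation stored in the DICOM header. The sketch below uses the pydicom library and the ImageOrientationPatient tag; the exact criteria used in the study may differ.

```python
# Minimal sketch: flag axial slices from DICOM metadata (assumes pydicom).
import numpy as np
import pydicom

def is_axial(dicom_path, tol=0.9):
    """Return True if the slice normal points roughly along the patient z-axis."""
    ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    iop = getattr(ds, "ImageOrientationPatient", None)
    if iop is None:
        return False  # no orientation metadata: treat as not verifiably axial
    row = np.array(iop[:3], dtype=float)   # direction of image rows
    col = np.array(iop[3:], dtype=float)   # direction of image columns
    normal = np.cross(row, col)            # slice normal
    return abs(normal[2]) >= tol           # axial slices have |z| close to 1
```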

2. Converting Data Formats

Images are often stored in DICOM format, a standard for medical imaging. For easier analysis in deep learning projects, these images are converted to a different format known as NIfTI. This conversion process must ensure that no important details are lost.
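
As a rough sketch of what such a conversion can look like, the snippet below reads a DICOM series and writes a single NIfTI volume using SimpleITK. The study does not specify which conversion tool was used, so treat this as one possible implementation.

```python
# Sketch: convert one DICOM series to a NIfTI file (assumes SimpleITK).
import SimpleITK as sitk

def dicom_series_to_nifti(dicom_dir, out_path):
    """Read the first DICOM series found in a folder and save it as NIfTI."""
    reader = sitk.ImageSeriesReader()
    series_ids = reader.GetGDCMSeriesIDs(dicom_dir)
    if not series_ids:
        raise ValueError(f"No DICOM series found in {dicom_dir}")
    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_dir, series_ids[0]))
    image = reader.Execute()
    sitk.WriteImage(image, out_path)  # e.g. "scan.nii.gz"
```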

3. Removing Unnecessary Scans

Certain images, known as localisers, are taken to position the patient's head in the scanner. Because they do not show useful brain tissue, they are excluded from the dataset. Scans that are split into separate skull base and vault images for technical reasons are also removed.
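
One common heuristic, shown here purely as an assumed sketch, is to inspect the DICOM ImageType tag, where scout images are usually labelled as localizers.

```python
# Sketch: detect localiser (scout) images from DICOM metadata (assumes pydicom).
import pydicom

def is_localiser(dicom_path):
    """Heuristic check: scout images typically carry LOCALIZER in ImageType."""
    ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    image_type = [str(v).upper() for v in getattr(ds, "ImageType", [])]
    return "LOCALIZER" in image_type
```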

4. Excluding Bone Reformats

Some scans are reconstructed to emphasise bone detail rather than soft tissue, which makes them less useful for stroke diagnosis. These bone reformats are identified and removed from the dataset.
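
Again as an illustrative heuristic rather than the study's exact rule, bone reconstructions can often be spotted from the convolution kernel or series description in the DICOM header.

```python
# Sketch: flag bone reformats from DICOM header fields (assumes pydicom).
import pydicom

def is_bone_reformat(dicom_path):
    """Heuristic check on kernel and description; naming varies by scanner."""
    ds = pydicom.dcmread(dicom_path, stop_before_pixels=True)
    kernel = str(getattr(ds, "ConvolutionKernel", "")).lower()
    description = str(getattr(ds, "SeriesDescription", "")).lower()
    return "bone" in kernel or "bone" in description
```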

5. Registering Images

To analyze brain lesions effectively, all scans must be aligned to a common reference. This is achieved through a process called registration, which aligns the CT scans with a standard MRI template. This step ensures that brain regions are represented consistently.
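
A registration step of this kind might look like the sketch below, which rigidly aligns a CT volume to a template using mutual information, a similarity metric that works across modalities such as CT and MRI. The template path, metric, and optimiser settings are assumptions rather than the study's exact configuration.

```python
# Sketch: rigid registration of a CT volume to a template (assumes SimpleITK).
import SimpleITK as sitk

def register_to_template(ct_path, template_path, out_path):
    """Align a CT scan to a template image and save the resampled result."""
    moving = sitk.ReadImage(ct_path, sitk.sitkFloat32)
    fixed = sitk.ReadImage(template_path, sitk.sitkFloat32)

    registration = sitk.ImageRegistrationMethod()
    # Mutual information handles the different intensity profiles of CT and MRI.
    registration.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
    registration.SetOptimizerAsRegularStepGradientDescent(
        learningRate=1.0, minStep=1e-4, numberOfIterations=200)
    registration.SetInterpolator(sitk.sitkLinear)

    # Start from a geometry-centred rigid (Euler) transform.
    initial = sitk.CenteredTransformInitializer(
        fixed, moving, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY)
    registration.SetInitialTransform(initial, inPlace=False)

    transform = registration.Execute(fixed, moving)
    resampled = sitk.Resample(moving, fixed, transform,
                              sitk.sitkLinear, 0.0, moving.GetPixelID())
    sitk.WriteImage(resampled, out_path)
```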

6. Cropping Background

To keep the focus on the brain, the excess background is cropped away from each image. This improves the dataset by removing regions that carry no useful information.
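
A simple way to do this, sketched below under the assumption that intensities are in Hounsfield units, is to threshold away air and crop the volume to the bounding box of the remaining voxels.

```python
# Sketch: crop a CT volume to the bounding box of non-air voxels (assumes numpy).
import numpy as np

def crop_background(volume, threshold=-500):
    """Crop to voxels above a threshold; air is around -1000 HU."""
    foreground = volume > threshold
    if not foreground.any():
        return volume  # nothing above threshold, leave the volume unchanged
    coords = np.argwhere(foreground)
    mins = coords.min(axis=0)
    maxs = coords.max(axis=0) + 1
    slices = tuple(slice(lo, hi) for lo, hi in zip(mins, maxs))
    return volume[slices]
```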

7. Padding and Resizing

Deep learning models typically require images to be the same size. Therefore, each scan is either resized or padded with zeros to fit a predetermined dimension of 500x400 pixels.
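
A minimal sketch of this pad-or-crop step is shown below; centring the image on the canvas is an assumption, since the article only states that scans are resized or zero-padded to 500x400 pixels.

```python
# Sketch: centre-pad or centre-crop each slice to a fixed size (assumes numpy).
import numpy as np

def pad_or_crop(slice_2d, target=(500, 400)):
    """Return a slice of exactly the target shape, zero-padded or cropped."""
    out = np.zeros(target, dtype=slice_2d.dtype)
    h = min(slice_2d.shape[0], target[0])
    w = min(slice_2d.shape[1], target[1])
    # Offsets that centre the source region inside the target canvas.
    src_y = (slice_2d.shape[0] - h) // 2
    src_x = (slice_2d.shape[1] - w) // 2
    dst_y = (target[0] - h) // 2
    dst_x = (target[1] - w) // 2
    out[dst_y:dst_y + h, dst_x:dst_x + w] = \
        slice_2d[src_y:src_y + h, src_x:src_x + w]
    return out
```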

8. Scaling Brightness

Finally, the brightness levels of the images, which indicate different types of tissues, need to be uniform. This is done by scaling the brightness values to a consistent range, making the images easier for algorithms to interpret.
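
For example, one common approach, used here as an assumption rather than the study's exact choice, is to clip CT intensities to a brain window and rescale them to the range 0 to 1.

```python
# Sketch: window and normalise CT intensities (assumes Hounsfield units).
import numpy as np

def scale_intensities(volume, low=0.0, high=80.0):
    """Clip to an assumed brain window (0-80 HU) and rescale to [0, 1]."""
    clipped = np.clip(volume, low, high)
    return (clipped - low) / (high - low)
```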

Data Loss and Processing Time

Of the 10,659 CT image datasets collected in the clinical trial, 5,868 (45%) made it through the full preparation pipeline. Scans were rejected because they were non-axial (1,920), bone reformats (687), or separated skull base and vault images (1,226), or because registration failed (465). Of the axial scans that were not localisers, bone reformats, or split brains, 93% were processed successfully.

The average time to process a single scan was around two minutes, but this could vary significantly based on factors like the number of slices in the scan and the patient's position during scanning. Overall, developing this data preparation system required considerable effort, with hundreds of workdays spent refining the process.

Importance of Standardization

The ultimate aim of this entire process is to create a standardized method for preparing medical images for deep learning analysis. This is vital as it bridges the gap between raw clinical data and the refined datasets needed for effective machine learning model training.

By sharing this preparation pipeline openly, researchers working on similar projects can benefit and potentially improve their own processes. The hope is that standardized methods will lead to better training of deep learning models, which in turn can enhance stroke diagnosis and treatment.

Conclusion

The creation of medical image datasets for deep learning involves navigating various challenges, from managing different image types to ensuring data quality. A systematic and standardized approach to preparing these images is essential for developing effective software that can aid in the treatment of stroke patients. By making these processes transparent and accessible, the medical community can foster innovation in healthcare technology, ultimately improving patient outcomes.

Original Source

Title: Challenges of building medical image datasets for development of deep learning software in stroke

Abstract: Despite the large amount of brain CT data generated in clinical practice, the availability of CT datasets for deep learning (DL) research is currently limited. Furthermore, the data can be insufficiently or improperly prepared for machine learning and thus lead to spurious and irreproducible analyses. This lack of access to comprehensive and diverse datasets poses a significant challenge for the development of DL algorithms. In this work, we propose a complete semi-automatic pipeline to address the challenges of preparing a clinical brain CT dataset for DL analysis and describe the process of standardising this heterogeneous dataset. Challenges include handling image sets with different orientations (axial, sagittal, coronal), different image types (to view soft tissues or bones) and dimensions, and removing redundant background. The final pipeline was able to process 5,868/10,659 (45%) CT image datasets. Reasons for rejection include non-axial data (n=1,920), bone reformats (n=687), separated skull base/vault images (n=1,226), and registration failures (n=465). Further format adjustments, including image cropping, resizing and scaling are also needed for DL processing. Of the axial scans that were not localisers, bone reformats or split brains, 5,868/6,333 (93%) were accepted, while the remaining 465 failed the registration process. Appropriate preparation of medical imaging datasets for DL is a costly and time-intensive process.

Authors: Alessandro Fontanella, Wenwen Li, Grant Mair, Antreas Antoniou, Eleanor Platt, Chloe Martin, Paul Armitage, Emanuele Trucco, Joanna Wardlaw, Amos Storkey

Last Update: 2023-09-26 00:00:00

Language: English

Source URL: https://arxiv.org/abs/2309.15081

Source PDF: https://arxiv.org/pdf/2309.15081

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arxiv for use of its open access interoperability.
