Simple Science

Cutting-edge science explained simply

Quantitative Biology · Computer Vision and Pattern Recognition · Artificial Intelligence · Quantitative Methods

Advancing Breast Cancer Diagnosis with Deep Learning

This study uses deep learning and transfer learning for HER2 scoring in breast cancer.

― 6 min read


Deep Learning for HER2 Scoring: Automating breast cancer diagnosis with effective models.

When doctors think a patient might have breast cancer, they look at tissue samples under a microscope. These samples are usually stained to make the cells easier to see. Two common staining methods are Hematoxylin and Eosin (H&E) and Immunohistochemistry (IHC). IHC is especially important because it helps doctors determine if a patient can receive targeted treatments. There's a lot of interest in using computers and deep learning to automatically read these slides so that doctors can spend less time squinting at tiny details.

The issue, however, is that teaching computers to look at medical images isn't as straightforward as it sounds. To train a model effectively, we need a lot of labeled images, and in medicine those labels have to come from expert pathologists, which makes them scarce and expensive to produce. That's where Transfer Learning comes in. This method lets us reuse what the computer has learned from one set of images to help it understand another set.

What is Transfer Learning?

Imagine you are trying to teach a child how to recognize different fruits. If the child already knows what an apple looks like, they can use that knowledge to learn what a peach looks like more quickly. In the same way, transfer learning uses knowledge from one area (like IHC images) to help with another (like H&E images). This approach can save time, especially when working with medical data, which is often scarce and hard to collect.
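To make this concrete, here is a minimal sketch of the usual transfer-learning recipe with PyTorch and torchvision: load a network pre-trained on one image collection and reuse it as a frozen feature extractor for a new task. The choice of ResNet-18 is purely illustrative, not necessarily what the paper used.

```python
# A hedged sketch of transfer learning: reuse a pre-trained network as a
# frozen feature extractor. ResNet-18 is an illustrative choice, not
# necessarily the architecture used in the paper.
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()         # drop the original classification head
for param in backbone.parameters():
    param.requires_grad = False     # keep the transferred knowledge fixed

# A batch of new images can now be embedded with knowledge learned elsewhere:
# embeddings = backbone(batch_of_patches)  # shape: (batch_size, 512)
```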

Why Multiple-Instance Learning?

Sometimes, we don’t have detailed notes (or labels) for every image. That’s where multiple-instance learning (MIL) steps in. Think of it like a scavenger hunt: if you have a bag full of items and you know at least one item in the bag is what you’re looking for, you can treat the whole bag as a find. Similarly, with MIL, a whole bag of patches is labeled positive if at least one patch in it is positive, so we only need one label per bag rather than one per patch. This makes it much easier to work with images where we don’t have every detail annotated (see the tiny sketch of this rule below).
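Here is a minimal sketch of that bag-labeling rule; the function name and the toy labels are illustrative, not from the paper.

```python
# The standard MIL assumption: a bag is positive if at least one of its
# instances (patches) is positive. Purely illustrative names and data.

def bag_label(instance_labels: list[int]) -> int:
    """Return 1 if any instance in the bag is positive, else 0."""
    return int(any(label == 1 for label in instance_labels))

print(bag_label([0, 0, 1, 0]))  # 1: one positive patch makes the bag positive
print(bag_label([0, 0, 0, 0]))  # 0: no positive patch, so the bag is negative
```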

The Study

In this study, we wanted to see how transfer learning could help deep learning models score HER2, a crucial biomarker in breast cancer. We compared models pre-trained on three different types of images:

  1. H&E images: These are the stained images used for examining tissues.
  2. IHC images: These images provide specific information about HER2.
  3. Non-medical images: Think of random pictures, like cats and landscapes.

We examined how models pre-trained on each of these image types performed. With a focus on HER2 scoring, we also built a model that can draw attention to the specific areas within the slides that matter for diagnosis.

The Methodology

We started by grabbing tiny pieces, called patches, from the whole slide images. These patches were taken from both H&E and IHC stained slides. To make our training data more varied and robust, we augmented the patches, tweaking their brightness and color and applying minor rotations (a sketch of this kind of augmentation follows below).
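As a hedged illustration, here is what such an augmentation pipeline might look like with torchvision; the specific transform values are assumptions for the sketch, not the paper's settings.

```python
# An illustrative patch-augmentation pipeline. The parameter values are
# assumptions, not the ones used in the study.
import torchvision.transforms as T

augment = T.Compose([
    T.ColorJitter(brightness=0.1, saturation=0.1, hue=0.05),  # vary stain appearance
    T.RandomRotation(degrees=10),                             # minor rotations
    T.ToTensor(),                                             # PIL image -> tensor
])

# Applied to a PIL patch image: tensor_patch = augment(patch)
```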

Using a pre-trained model, we turned each patch into an embedding, a compact numerical summary our computer can work with, and added a new attention layer on top. This attention layer helps the model focus on the important parts of the images. Think of it as putting on a pair of glasses that makes the details pop.
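Here is a minimal sketch of attention-based MIL pooling in PyTorch, in the spirit of that attention layer. The layer sizes are illustrative assumptions; the four output classes follow the four HER2 scores mentioned in the paper.

```python
# A hedged sketch of an attention-MIL head over patch embeddings.
# Layer sizes are illustrative; 4 classes matches the 4 HER2 scores.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, embed_dim: int = 512, attn_dim: int = 128, n_classes: int = 4):
        super().__init__()
        # Scores each patch embedding for how much it should contribute
        self.attention = nn.Sequential(
            nn.Linear(embed_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1),
        )
        self.classifier = nn.Linear(embed_dim, n_classes)

    def forward(self, embeddings: torch.Tensor):
        # embeddings: (num_patches, embed_dim), from the frozen embedding model
        weights = torch.softmax(self.attention(embeddings), dim=0)  # (num_patches, 1)
        bag_embedding = (weights * embeddings).sum(dim=0)           # weighted average
        return self.classifier(bag_embedding), weights

# The per-patch weights are what later get painted back as a heatmap.
```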

Getting Down to Business

Once we set everything up, we trained our models. We drew a fresh bag of patches for each training step, so no bag was reused during training; this exposed the model to as many different combinations of patches as possible (a small sketch of this sampling follows below).
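A hedged sketch of that sampling idea; the bag size and patch counts are made up for illustration.

```python
# Draw a fresh random bag of patches for every training step, so the
# same combination is (virtually) never seen twice. Illustrative numbers.
import random

def sample_bag(patch_indices: list[int], bag_size: int = 50) -> list[int]:
    """Return a new random combination of patches for one training step."""
    return random.sample(patch_indices, bag_size)

patches = list(range(1000))       # stand-in for the patches of one slide
for step in range(3):
    bag = sample_bag(patches)     # a different bag each step
```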

Before training, we split our data into two groups: one for training and one for testing. We wanted to see how well our model could perform on new data it hadn't seen before. This is like baking a cake using a recipe for the first time and then seeing how it holds up when you serve it to your friends.

Results

We found that embedding models pre-trained on H&E images performed better overall than those pre-trained on IHC or non-medical images, reaching an average AUC-ROC of 0.622 across the four HER2 scores. In particular, the model pre-trained on the PatchCamelyon dataset, itself an H&E dataset, outshone the rest across all measures of success.

We also wanted to know how well our model could predict HER2 scores on whole slides. We used a method similar to simulating a game many times to get a better sense of the likely outcome: by repeatedly sampling bags of patches from a slide and averaging the predictions, we made the final slide-level score more reliable (a sketch of this idea follows below).
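Here is a hedged sketch of that repeated-sampling step; the model is a stand-in callable, and the round and bag sizes are illustrative.

```python
# Average bag-level predictions over many randomly sampled bags from one
# slide. The `model` is a stand-in; round and bag sizes are illustrative.
import numpy as np

def predict_slide(model, patch_embeddings, n_rounds: int = 25,
                  bag_size: int = 50, seed: int = 0):
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_rounds):
        idx = rng.choice(len(patch_embeddings), size=bag_size, replace=False)
        probs.append(model(patch_embeddings[idx]))  # per-bag class probabilities
    return np.mean(probs, axis=0)                   # averaged slide-level score
```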

Not only did we want to know how the model scored, but we also wanted to see where it was looking. Using the attention mechanism, we could create a heatmap showing which areas of the slide were important for the model's prediction. It was like shining a flashlight on the spots that mattered most (a small sketch of this mapping follows below).
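A hedged sketch of turning patch-wise attention weights into a heatmap: each patch's weight is painted back at its position on a downsampled grid of the slide. The coordinates, weights, and grid shape are all made up for illustration.

```python
# Paint each patch's attention weight back onto a grid over the slide.
# All coordinates and weights below are illustrative.
import numpy as np

def attention_heatmap(coords, weights, grid_shape):
    """coords: (row, col) grid positions; weights: per-patch attention."""
    heat = np.zeros(grid_shape)
    for (r, c), w in zip(coords, weights):
        heat[r, c] = w
    return heat / heat.max()  # normalise so the hottest patch equals 1.0

print(attention_heatmap([(0, 0), (0, 1), (1, 1)], [0.1, 0.7, 0.2], (2, 2)))
```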

Visualizing the Results

To show off our findings, we created some heatmaps based on the data. These heatmaps highlighted areas that were suspected to be HER2 positive. Imagine a treasure map, but instead of gold, it shows where the important cancer markers are hiding in the tissue.

During testing, we noticed that as we increased the number of sampled patches, the model became more confident in its predictions: more samples led to more stable and accurate slide-level results.

Conclusions and Future Plans

In summary, we successfully built a model for automatic HER2 scoring using H&E images. Transfer learning from H&E to H&E was more effective than transferring from IHC or non-medical images. This study shows promise for using MIL when detailed annotations are lacking.

As for future plans, there’s still work to be done. We hope to fine-tune our models and explore more strategies to enhance their performance. If we can figure out the best ways to use the various data sources, we could unlock new ways to improve medical image analysis, one slide at a time.

In the end, while we may not be able to crack the cure for cancer just yet, we’re certainly on the right path to making diagnosis a whole lot easier, one pixel at a time. Who knew that helping doctors could start with just a bag of patches and a sprinkle of computer science?

Original Source

Title: Leveraging Transfer Learning and Multiple Instance Learning for HER2 Automatic Scoring of H&E Whole Slide Images

Abstract: Expression of human epidermal growth factor receptor 2 (HER2) is an important biomarker in breast cancer patients who can benefit from cost-effective automatic Hematoxylin and Eosin (H&E) HER2 scoring. However, developing such scoring models requires large pixel-level annotated datasets. Transfer learning allows prior knowledge from different datasets to be reused while multiple-instance learning (MIL) allows the lack of detailed annotations to be mitigated. The aim of this work is to examine the potential of transfer learning on the performance of deep learning models pre-trained on (i) Immunohistochemistry (IHC) images, (ii) H&E images and (iii) non-medical images. A MIL framework with an attention mechanism is developed using pre-trained models as patch-embedding models. It was found that embedding models pre-trained on H&E images consistently outperformed the others, resulting in an average AUC-ROC value of 0.622 across the 4 HER2 scores (0.59–0.80 per HER2 score). Furthermore, it was found that using multiple-instance learning with an attention layer not only allows for good classification results to be achieved, but it can also help with producing visual indication of HER2-positive areas in the H&E slide image by utilising the patch-wise attention weights.

Authors: Rawan S. Abdulsadig, Bryan M. Williams, Nikolay Burlutskiy

Last Update: 2024-11-05

Language: English

Source URL: https://arxiv.org/abs/2411.05028

Source PDF: https://arxiv.org/pdf/2411.05028

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to arXiv for use of its open access interoperability.
