
Advancing Neuroimaging with Federated Learning Techniques

New methods improve communication and privacy in neuroimaging research using federated learning.



Figure: Federated learning in neuroimaging: efficient model training while maintaining privacy.

Deep learning has driven big changes in areas like computer vision (helping computers see) and natural language processing. Recently, it has started to make waves in the field of studying the brain using images, known as neuroimaging. As deep learning models get bigger and more complex, it becomes important to share and train them in a way that keeps sensitive data safe, especially since this data is often spread across many different places.

Collaborative analysis of MRI data can provide valuable insights and help researchers look at information beyond what one study might originally gather. MRI scans are usually kept for a long time, which means that a lot of data builds up at various research sites. With advancements in technology, data is getting more complex but also cheaper to manage. This encourages researchers to combine data from different teams to work with larger groups of samples and find important details while still keeping individual data private.

Training models on lots of data while keeping privacy is very important. However, bringing data from various sites into one central location for training could risk exposing sensitive information, leading to ethical issues. Federated Learning (FL) takes care of this by allowing different devices or organizations to train models without sharing the actual data.

In FL, a central server coordinates training, and client sites communicate only model updates, such as weights or gradients, instead of the data itself. In some setups, especially decentralized ones, there is no central server, and clients train a model together by communicating with each other. Challenges arise because of differences in data between clients, limited communication speed, and the cost of computation. This article focuses on improving communication efficiency in distributed federated neuroimaging systems by training simpler, sparse models at local sites.

What is Federated Learning?

Federated Learning stands out from traditional methods of distributed learning in several ways:

Non-Identical Data

The data clients work with is not the same across the board. Each local site might have data that doesn't represent the whole population accurately.

Unbalanced Data

The amount of data each client has can differ widely, causing an imbalance in representation.

Large Distribution

Often, there are more clients than there are samples per client, reflecting how widely the data is spread across sites.

Limited Communication

Communication occurs infrequently among clients or between clients and a central server, due to slow or expensive connections.

This work mainly aims to cut down communication costs when dealing with unbalanced and non-identical data. The method does this by finding a smaller network based on each local site’s data and only sharing the parameters of this smaller network during communication rounds. In each round, a group of clients is chosen, and federated training continues on this smaller group of clients.

The Federated Learning Optimization Problem

In a typical FL setup, a central server builds a global statistical model by periodically communicating with a set of clients. The federated averaging (FedAvg) algorithm can work with any objective that is written as a weighted sum of per-client losses.

In a usual machine learning problem, the goal is to minimize the difference between predicted results and actual outcomes. We assume the data is divided among several clients. The FL framework helps address the issues arising from non-identical data distribution.
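To make this concrete, the global objective in federated averaging is usually written as a weighted sum of the clients' local losses. The notation below is the standard textbook form rather than a formula quoted from the paper:

```latex
% Global federated objective: K clients, client k holds n_k of the n samples
\min_{w} \; F(w) \;=\; \sum_{k=1}^{K} \frac{n_k}{n}\, F_k(w),
\qquad
F_k(w) \;=\; \frac{1}{n_k} \sum_{i \in \mathcal{P}_k} \ell(w;\, x_i,\, y_i)
```

Here $F_k$ is client $k$'s average loss over its own data partition $\mathcal{P}_k$, and $n_k$ is the number of samples that client holds.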

When creating a federated training method, several key factors need to be considered to maintain data privacy and deal with differences in client data and resource limits. Various works have aimed to handle non-identical data, but some studies suggest that the accuracy of FL with non-identical data tends to drop.

How Federated Learning Applies to Neuroimaging

In the last ten years, neuroimaging has seen significant growth in data sharing, open-source tools, and collaboration across many sites. This change is largely due to the high costs and time involved in collecting neuroimaging data. By combining data, researchers can find insights that reach beyond the original aims of individual studies. Sharing data helps strengthen research through larger sample sizes and confirmation of results, which is critical in neuroimaging studies.

Being able to increase sample sizes not only gives more dependable predictions but also strengthens the reliability and validity of research findings. It works to prevent data manipulation and fabrication. Furthermore, reusing data can significantly lower research costs.

Federated Learning is gaining recognition as an important method in healthcare and neuroimaging. In biomedical imaging, FL has been used for various tasks. These include segmenting whole brains from MRI scans, detecting brain tumors, classifying fMRI data from multiple sites, and finding biomarkers for diseases. Some platforms exist to enable focused and private distributed data processing in brain imaging, highlighting FL’s role in making healthcare data analysis more efficient while protecting privacy.

Improving Efficiency in Federated Learning

A primary goal in reducing model size is to find smaller networks within larger ones. This approach is appealing for several reasons, especially for real-time tasks on devices with limited resources, which are common in federated and collaborative learning situations. Making large networks smaller can significantly reduce the processing load.

Recently, a concept called the lottery ticket hypothesis has emerged. It suggests that smaller, effective sub-networks exist within larger, more complex networks. These sub-networks, when trained in isolation from the same initial weights, can achieve results comparable to the fully trained dense network.

Pruning methods in deep learning generally fall into three categories: inducing sparsity before training, during training, or after training.
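To illustrate the "after training" category, here is a minimal magnitude-pruning sketch. The helper below is hypothetical and not taken from the paper; it simply zeroes out the smallest-magnitude weights of a trained PyTorch model:

```python
import torch

def magnitude_prune(model: torch.nn.Module, sparsity: float) -> dict:
    """Zero out the smallest-magnitude weights, keeping (1 - sparsity) of them."""
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() < 2:                       # skip biases and norm layers
            continue
        flat = param.detach().abs().flatten()
        k = int(sparsity * flat.numel())          # number of weights to remove
        if k == 0:
            continue
        threshold = torch.kthvalue(flat, k).values
        mask = (param.detach().abs() > threshold).float()
        param.data.mul_(mask)                     # apply the mask in place
        masks[name] = mask
    return masks
```

Pruning before training, by contrast, picks the mask from importance scores computed at initialization, which is the route this work follows.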

In the context of FL, using a lottery ticket method might not be efficient in terms of communication. Such methods often need costly pruning and retraining cycles. Few studies have focused on pruning in FL settings. Some have tried to introduce sparsity during training in FL, but many of these still face communication issues.

This work seeks to address these limitations through a new method.

Method Overview

In this section, we describe our proposed method for discovering smaller networks and training these simplified models effectively in a communication-efficient manner.

Discovering Sub-networks

Given a dataset at a local site, the training of a neural network can be expressed as minimizing certain risks. A sub-network within this larger network is a version that has fewer parameters.

Finding these smaller networks at the start adds a constraint: every parameter iterate during training must stay within a restricted set, namely the set of weight vectors that respect a fixed sparsity mask. The initial parameters must satisfy this constraint as well, and training then keeps it satisfied throughout.
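Under that reading, sub-network discovery can be written as a constrained version of the usual training objective. The formulation below is a common way to express it and uses my own notation; it is not a formula quoted from the paper:

```latex
% Train a sparse sub-network: m is a binary mask, s is the target sparsity,
% \odot is element-wise multiplication, and d is the number of parameters
\min_{w,\, m} \; \mathcal{L}\big(w \odot m\big)
\quad \text{subject to} \quad
m \in \{0,1\}^{d}, \qquad \|m\|_{0} \le (1 - s)\, d
```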

Importance Scoring

An effective way to determine which connections in a network matter is to look at how sensitive the loss is to each parameter, for example by examining the gradient of the loss with respect to that parameter. Parameters with the strongest influence on the loss receive the highest importance scores and should be kept.

Once this is determined, the parameters with the highest scores are kept while others are removed. This process helps in creating an efficient, smaller network that can still perform well.
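A minimal sketch of one common sensitivity-based scoring scheme, where each weight is scored by the magnitude of weight times gradient on a local batch (in the spirit of SNIP-style saliency), is shown below. The exact scoring rule used in the paper may differ, and the function names here are illustrative:

```python
import torch

def importance_scores(model, loss_fn, data_batch, labels):
    """Score each weight by |weight * gradient of the loss| on one local batch."""
    model.zero_grad()
    loss = loss_fn(model(data_batch), labels)
    loss.backward()
    scores = {}
    for name, param in model.named_parameters():
        if param.grad is not None:
            scores[name] = (param.detach() * param.grad.detach()).abs()
    return scores

def top_k_mask(scores, keep_fraction):
    """Keep the globally highest-scoring fraction of parameters."""
    all_scores = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(keep_fraction * all_scores.numel()))
    threshold = torch.topk(all_scores, k).values.min()
    return {name: (s >= threshold).float() for name, s in scores.items()}
```

Turning the scores into a mask by keeping the top-scoring fraction, as `top_k_mask` does, is one possible choice; other thresholding strategies would work equally well here.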

Proposed Method: Sparse Federated Learning for Neuroimaging (NeuroSFL)

We introduce a new method for efficient distributed sub-network discovery in the context of neuroimaging. The goal of NeuroSFL is to improve communication efficiency in decentralized federated learning scenarios, especially when dealing with non-identical data.

The process starts with a shared initial model at all local sites. Next, importance scores are calculated for each parameter based on the local imaging data, and each client uses its scores to build a mask that keeps the top-scoring parameters.

During federated training, clients train on their local data. At the end of each round, every client sends back only the weights that remain active under its mask, and these are averaged to complete the communication round.

This results in better communication efficiency since only a small portion of the whole model is shared during the training process.
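To make the communication pattern concrete, here is a minimal sketch of one mask-restricted averaging round. It assumes per-client masks like those produced above and illustrates the general idea rather than the authors' exact implementation:

```python
import torch

def federated_round(global_weights, clients, masks, local_train_fn):
    """One communication round: each client trains locally and shares only
    the weights kept by its mask; the aggregator averages what it receives."""
    sums = {name: torch.zeros_like(w) for name, w in global_weights.items()}
    counts = {name: torch.zeros_like(w) for name, w in global_weights.items()}

    for client_id, client_data in clients.items():
        # Each client starts from the shared model and trains on its own data.
        local = {name: w.clone() for name, w in global_weights.items()}
        local = local_train_fn(local, client_data)

        # Only the unmasked (active) weights are communicated upstream.
        mask = masks[client_id]
        for name in sums:
            sums[name] += local[name] * mask[name]
            counts[name] += mask[name]

    # Average each weight over the clients that actually reported it;
    # weights no client reported keep their previous global value.
    return {name: torch.where(counts[name] > 0,
                              sums[name] / counts[name].clamp(min=1),
                              global_weights[name])
            for name in sums}
```

Because each client transmits only its masked subset of weights, the per-round payload shrinks roughly in proportion to the chosen sparsity level.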

Iterative Importance Score Calculation

We tested an iterative version of the importance scoring method. This technique allows multiple rounds of assessing parameter importance, leading to potentially better models.

Dataset and Non-Identical Data

For our experiments, we used the Adolescent Brain Cognitive Development (ABCD) dataset, which focuses on brain development and child health. It is large, with over 10,000 children involved, and includes various MRI scans as well as demographic details about the participants.

To create non-identical data distributions among clients, we used a statistical method that helps distribute class labels unevenly among different clients.
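The summary does not name the exact partitioning scheme, but a Dirichlet-based label split is a common way to simulate non-identical clients; the sketch below shows that assumed approach, with the hypothetical `alpha` parameter controlling how skewed the clients are:

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha=0.5, seed=0):
    """Split sample indices across clients so that each class is spread
    unevenly; smaller alpha means more skewed (more non-identical) clients."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        # Proportion of this class assigned to each client.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        splits = (np.cumsum(proportions) * len(cls_idx)).astype(int)[:-1]
        for client_id, chunk in enumerate(np.split(cls_idx, splits)):
            client_indices[client_id].extend(chunk.tolist())
    return client_indices
```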

Experimental Setup

We aimed to classify participant sex from MRI scans using a modified version of a well-known deep learning model. Hyperparameters such as the learning rate and batch size were chosen through a careful search.

Our experiments compared NeuroSFL against standard methods. These include methods where each client trains its model on its local data and then shares it for averaging.

In our research, we explored the effects of using local masks versus global masks, showing how different strategies impact performance.

Results and Discussion

Effect of Sparsity Levels

We first looked at how different levels of sparsity influenced model performance. Our approach performed well even as sparsity increased, maintaining high accuracy compared to traditional methods.

NeuroSFL showed strong results across various sparsity levels, outperforming other techniques especially at mid-level sparsity.

Performance on Non-Identical Data

Our model also maintained good performance while working with diverse data from several clients. The accuracy was consistent, suggesting that it can adapt well to different situations.

Iterative Importance Score Performance

We assessed several variations of the iterative importance scoring method to see how they affected performance. The results showed that increasing the number of iterations did not significantly improve accuracy compared to simpler methods.

Performance Efficiency in Real World

To test our method in a practical setting, we employed a federated learning system designed specifically for neuroimaging. Our method showed considerable speed improvements over traditional models in terms of communication time, demonstrating its effectiveness in real-world applications.

Conclusion

In summary, we have introduced a new approach to federated learning specifically for training sparse models in neuroimaging research. By focusing on creating smaller networks based on parameter importance, we achieved reduced communication costs and better efficiency in distributed training. Our method provides an effective way to work with neuroimaging data while keeping privacy intact and enhancing communication efficiency.

Original Source

Title: Efficient Federated Learning for distributed NeuroImaging Data

Abstract: Recent advancements in neuroimaging have led to greater data sharing among the scientific community. However, institutions frequently maintain control over their data, citing concerns related to research culture, privacy, and accountability. This creates a demand for innovative tools capable of analyzing amalgamated datasets without the need to transfer actual data between entities. To address this challenge, we propose a decentralized sparse federated learning (FL) strategy. This approach emphasizes local training of sparse models to facilitate efficient communication within such frameworks. By capitalizing on model sparsity and selectively sharing parameters between client sites during the training phase, our method significantly lowers communication overheads. This advantage becomes increasingly pronounced when dealing with larger models and accommodating the diverse resource capabilities of various sites. We demonstrate the effectiveness of our approach through the application to the Adolescent Brain Cognitive Development (ABCD) dataset.

Authors: Bishal Thapaliya, R. Ohib, E. P. T. Geenjaar, J. Liu, V. Calhoun, S. Plis

Last Update: 2024-05-15

Language: English

Source URL: https://www.biorxiv.org/content/10.1101/2024.05.14.594167

Source PDF: https://www.biorxiv.org/content/10.1101/2024.05.14.594167.full.pdf

Licence: https://creativecommons.org/licenses/by/4.0/

Changes: This summary was created with assistance from AI and may have inaccuracies. For accurate information, please refer to the original source documents linked here.

Thank you to biorxiv for use of its open access interoperability.
