Blog post written by: Wen-Yen Kuo
Based on: Stenzel Cackowski, Emmanuel L. Barbier, Michel Dojat, Thomas Christen, ImUnity: A generalizable VAE-GAN solution for multicenter MR image harmonization,
Medical Image Analysis, Volume 88, 2023, 102799, ISSN 1361-8415, https://doi.org/10.1016/j.media.2023.102799.
Post Date: 2024/07

TL;DR

ImUnity is a novel 2.5D deep learning model designed for harmonizing magnetic resonance (MR) images acquired from multiple centers. It aims to remove site-specific biases while preserving biological information, enabling more effective multicenter studies and improving the generalizability of machine learning models in medical imaging.




Introduction

Motivation

Magnetic Resonance (MR) imaging is a cornerstone in medical diagnostics due to its ability to produce detailed images of the body's internal structures. However, MR images from the same patient can vary significantly when acquired at different sites. This variability arises from the qualitative nature of MR data acquisition, which is influenced by factors such as hardware differences, sequence parameters, and scanner artifacts. These discrepancies present significant challenges in multi-center MR studies, as they introduce non-biological variance that undermines the statistical power of the studies and impedes the generalizability of machine learning models trained on single-site data.

Previous work

To address these challenges, several methods have been developed over the past decade to harmonize MR data from multiple sites or scanners. Traditional techniques such as standardization, global scaling, and intensity histogram matching have been used to mitigate site or scanner biases, but these approaches often remove valuable local intensity variations. More sophisticated statistical methods, including RAVEL, ComBat, and dictionary learning, have shown promise in harmonizing diffusion MRI and longitudinal structural sequences; however, they require adjustments for new sites or scanners and consistent clinical information across all database centers. More recently, 2D deep-learning models such as CycleGAN, DeepHarmony, and CALAMITI have demonstrated potential for structural MR image harmonization, but they have limitations of their own, including the need for fine-tuning for each pair of sites and a restriction to two-site harmonization.

Method Description

Datasets

This study uses three open-source databases to evaluate the proposed harmonization model, ImUnity:

  1. ABIDE: A multi-center project focused on Autism Spectrum Disorder (ASD), containing T1-weighted scans from 11 different sites and scanners, including 621 scans (309 patients and 312 controls).
  2. OASIS: The Open Access Series of Imaging Studies, which includes T1-weighted scans from healthy and Alzheimer’s Disease (AD) adult subjects, with data from 4 different scanners.
  3. SRPBS: A multi-site database gathering multi-disorder subjects, used to validate harmonization results between different acquisition sites.
| Dataset | Sites | Scanners | Patients | Controls | Total |
|---------|-------|----------|----------|----------|-------|
| ABIDE   | 11    | 3        | 309      | 312      | 621   |
| OASIS   | -     | 4        | 493      | 605      | 1098  |
| SRPBS   | 6     | 12       | 9 healthy traveling subjects | | |

Models

ImUnity is a novel 2.5D deep-learning model based on a VAE-GAN architecture. It includes a confusion module to reduce site/scanner biases and an optional biological preservation module to maintain essential clinical features. The model leverages multiple 2D slices from different anatomical locations and image contrast transformations during training to generate harmonized MR images suitable for multi-center studies.
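As a rough schematic of how these pieces fit together, below is a minimal, hypothetical PyTorch sketch. It is not the authors' implementation (ImUnity additionally uses contrast-transformed slices during training, and the layer sizes here are placeholders); it only shows the wiring between the VAE generator, the site-unlearning (confusion) head, and the optional biological head.

```python
# Minimal, hypothetical sketch of the module wiring: a VAE encoder/decoder pair,
# a site-classification head to be "confused", and an optional biological head
# on the latent code. Layer sizes are illustrative, not the paper's.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, latent_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.mu = nn.Linear(64, latent_dim)        # mean of q(z|x)
        self.logvar = nn.Linear(64, latent_dim)    # log-variance of q(z|x)

    def forward(self, x):
        h = self.features(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    # For simplicity this sketch reconstructs fixed 32x32 patches, so inputs
    # are assumed to be 32x32 slices/patches as well.
    def __init__(self, latent_dim=128):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 64 * 8 * 8)
        self.upsample = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.upsample(self.fc(z).view(-1, 64, 8, 8))

class ImUnityLike(nn.Module):
    """VAE generator plus two heads on the latent code: a site/scanner
    classifier (whose information is unlearned) and a biological head."""
    def __init__(self, latent_dim=128, n_sites=11, n_bio_features=1):
        super().__init__()
        self.encoder = Encoder(latent_dim)
        self.decoder = Decoder(latent_dim)
        self.site_head = nn.Linear(latent_dim, n_sites)        # scan origin
        self.bio_head = nn.Linear(latent_dim, n_bio_features)  # e.g. disease status

    def forward(self, x):
        mu, logvar = self.encoder(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.decoder(z), self.site_head(z), self.bio_head(z), mu, logvar
```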

Loss Functions

1. Discriminator Loss
  • Purpose: This loss function is used to train the discriminator to differentiate between real and generated (fake) images.
  • Function: Binary cross-entropy loss is used.
2. Confusion Loss
  • Purpose: To train the encoder to produce a latent space representation that does not contain site-specific information.
  • Function: Categorical cross-entropy loss.
  • Formula: l_{\text{confusion}}(P) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{s=1}^{S} \frac{\log(p_{si})}{S}
3. Site/Scanner Unlearning Module Loss
  • Purpose: To predict the scan’s origin based on the latent space representation, which helps in removing site/scanner biases.
  • Function: Categorical cross-entropy loss.
  • Formula: l_{\text{site}}(P, Y) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{s=1}^{S} \mathbb{1}(y_i = s) \log(p_{si})
4. Biological Preservation Module Loss
  • Purpose: To preserve biological features such as age or disease status in the latent space representation.
  • Function: Binary cross-entropy loss.
  • Formula: l_{\text{biological}}(P, Y) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{f \in \text{features}} \left[ y_{fi} \log(p_{fi}) + (1 - y_{fi}) \log(1 - p_{fi}) \right]
5. Generator Loss
  • Purpose: To train the generator to produce realistic and harmonized images (see the code sketch after this list for how the individual terms combine).
  • Overall formula: l_{\text{generator}} = - \lambda_1 l_{\text{discriminator}} + \lambda_2 l_{\text{confusion}} + \lambda_3 l_{\text{biological}} + \lambda_4 l_1 + \lambda_5 l_{\text{KL}}
6. L1 Loss (Reconstruction Loss)
  • Purpose: To ensure accurate reconstruction of input images by the generator.
  • Function: Mean absolute error (L1 loss).
  • Formula: l_1 = \text{mean}(|\hat{S}_1^r - S_1^r|)
7. Kullback-Leibler Divergence
  • Purpose: To ensure a dense and smooth latent space representation.
  • Function: Kullback-Leibler divergence.
  • Formula:  l_{\text{KL}}(P, Q) = \sum_{i} P(i) \log \frac{P(i)}{Q(i)}
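These terms map onto standard cross-entropy, L1, and KL losses. The following is a minimal PyTorch sketch (an assumption, not the authors' code) of how they could be computed for the heads of the sketch above. The λ weights are placeholders, the adversarial term is written in the usual label-flipping form (which plays the same role as the -λ₁·l_discriminator term in the paper's generator loss), and the KL term is written in its standard closed form for a Gaussian latent code.

```python
# Minimal sketch (assumption, not the authors' code) of ImUnity's loss terms.
import torch
import torch.nn.functional as F

def discriminator_loss(real_logits, fake_logits):
    """Binary cross-entropy: real slices labelled 1, generated slices labelled 0."""
    real = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
    fake = F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    return real + fake

def confusion_loss(site_logits):
    """l_confusion = -(1/N) sum_i sum_s log(p_si)/S: push the site classifier's
    output toward a uniform distribution over sites."""
    log_p = F.log_softmax(site_logits, dim=1)   # log p_si, shape (N, S)
    return -log_p.mean()                        # mean over both N and S

def site_loss(site_logits, site_labels):
    """Categorical cross-entropy: predict each scan's site/scanner of origin."""
    return F.cross_entropy(site_logits, site_labels)

def biological_loss(bio_logits, bio_labels):
    """Binary cross-entropy on clinical features; bio_labels is a float 0/1 tensor."""
    return F.binary_cross_entropy_with_logits(bio_logits, bio_labels)

def reconstruction_loss(reconstruction, target):
    """L1 loss: mean absolute error between generated and target slices."""
    return F.l1_loss(reconstruction, target)

def kl_loss(mu, logvar):
    """Closed-form KL divergence between q(z|x) = N(mu, sigma^2) and N(0, I)."""
    return -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())

def generator_loss(fake_logits, site_logits, bio_logits, bio_labels,
                   reconstruction, target, mu, logvar,
                   lambdas=(1.0, 1.0, 1.0, 10.0, 0.1)):  # placeholder weights
    w1, w2, w3, w4, w5 = lambdas
    # Adversarial term: the generator tries to make the discriminator label its
    # outputs as real (equivalent in spirit to -lambda_1 * l_discriminator).
    adversarial = F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))
    return (w1 * adversarial
            + w2 * confusion_loss(site_logits)
            + w3 * biological_loss(bio_logits, bio_labels)
            + w4 * reconstruction_loss(reconstruction, target)
            + w5 * kl_loss(mu, logvar))
```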

2.5D Approach

The 2.5D approach addresses the intrinsic problem of 2D generative models, which often produce discontinuous outputs along the third axis. This method combines the outputs of three models, each trained along a specific axis, to produce higher-quality harmonized images. By fusing predictions using the median value, this approach mitigates artifacts and maintains a manageable number of parameters, thus not requiring extensive additional training datasets.
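A minimal sketch of this fusion step, assuming three already-trained 2D harmonizers (the callables model_ax, model_cor, and model_sag are placeholders):

```python
# Sketch of 2.5D fusion: three axis-specific 2D harmonizers process the volume
# slice by slice along their own axis, and the outputs are fused voxel-wise
# with the median to suppress slice-direction artifacts.
import numpy as np

def harmonize_along_axis(volume, model, axis):
    """Apply a 2D harmonizer slice by slice along the given axis."""
    slices = np.moveaxis(volume, axis, 0)              # bring the axis to the front
    harmonized = np.stack([model(s) for s in slices])  # harmonize each 2D slice
    return np.moveaxis(harmonized, 0, axis)            # restore the original layout

def fuse_25d(volume, model_ax, model_cor, model_sag):
    predictions = np.stack([
        harmonize_along_axis(volume, model_ax, 0),
        harmonize_along_axis(volume, model_cor, 1),
        harmonize_along_axis(volume, model_sag, 2),
    ])
    return np.median(predictions, axis=0)              # voxel-wise median fusion
```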

Experiments

The efficiency and flexibility of ImUnity were tested using the three aforementioned databases. The experiments aimed to evaluate the quality of reconstructed images, the capacity to remove site or scanner bias, and the ability to classify patients after data harmonization.

Harmonization on Traveling Subjects (OASIS+SRPBS)

The first experiment evaluated ImUnity's ability to transform images from one domain to their equivalent in another domain. Ground truth data from traveling subjects in the OASIS and SRPBS databases were used for validation. One domain was selected as the reference, and all images were transformed by the model into this reference domain. The results were compared to the ground truth images acquired in the reference domain, assessing the performance using visual verification, image intensity histograms, and the Structural Similarity Index Metric (SSIM).

(Cackowski et al., 2023, Figure 3 and Table 2)
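As an illustration of the SSIM part of this evaluation, the similarity between a harmonized volume and the ground-truth scan of the same traveling subject in the reference domain can be computed with scikit-image; the variable names below are placeholders, not the authors' evaluation code.

```python
# Illustrative evaluation helper: SSIM between a harmonized volume and the
# ground-truth reference-domain scan of the same traveling subject.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def harmonization_ssim(harmonized: np.ndarray, reference: np.ndarray) -> float:
    data_range = float(reference.max() - reference.min())
    return ssim(harmonized, reference, data_range=data_range)
```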

Harmonization's Effects on Site Classification (ABIDE)

The second experiment assessed ImUnity's ability to remove site-specific information by evaluating the performance of a classification algorithm on radiomic features extracted from the ABIDE data before and after harmonization. A standard Support Vector Machine (SVM) with a radial basis function kernel was used to classify the data, with accuracy and the Area Under Curve (AUC) of the Receiver Operating Characteristic (ROC) curve used as metrics to evaluate the classifier's performance.

(Cackowski et al., 2023, Figure 4)
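A hedged scikit-learn sketch of how such a check could look (the radiomic feature extraction is omitted; features and site_labels are placeholders):

```python
# Sketch of the site-classification check: if harmonization works, an RBF-kernel
# SVM should find it harder to predict a scan's site from radiomic features,
# so the cross-validated AUC should drop after harmonization.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def site_classification_auc(features: np.ndarray, site_labels: np.ndarray) -> float:
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
    # One-vs-rest ROC AUC for the multi-class site labels.
    return cross_val_score(clf, features, site_labels,
                           scoring="roc_auc_ovr", cv=5).mean()
```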

Harmonization's Effects on Autism Spectrum Disorder Prediction (ABIDE)

The third experiment evaluated ImUnity's impact on the classification of Autism Spectrum Disorder (ASD) patients using the ABIDE dataset. A 10-fold cross-validation procedure was employed to test the classifier's performance before and after harmonization. The model's ability to improve classification accuracy across different combinations of sites included in the database was examined, demonstrating the flexibility and robustness of ImUnity in enhancing patient classification outcomes.

(Cackowski et al., 2023, Figure 5)
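A minimal sketch of the 10-fold evaluation, meant to be run once on features from raw images and once on features from harmonized images (features and asd_labels are placeholders):

```python
# Illustrative 10-fold cross-validation for ASD vs. control classification;
# comparing the score before and after harmonization quantifies the benefit
# of ImUnity for downstream prediction.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def asd_prediction_accuracy(features: np.ndarray, asd_labels: np.ndarray) -> float:
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return cross_val_score(clf, features, asd_labels, scoring="accuracy", cv=10).mean()
```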

Conclusion

The ImUnity model, a novel 2.5D deep-learning framework based on a VAE-GAN architecture, demonstrates significant advancements in harmonizing MR images from multiple centers. By addressing the variability introduced by different acquisition sites, scanners, and protocols, ImUnity effectively reduces non-biological variance, enhancing the statistical power and generalizability of multi-center MR studies.

Key Findings

  1. Effective Harmonization: ImUnity successfully harmonizes MR images from different sites, producing consistent image quality and intensity levels, while preserving critical anatomical and clinical features.

  2. Improved Downstream Performance: Harmonization strips site-predictive information (site classifiers perform worse on harmonized features, indicating that site bias has been removed) while improving the accuracy and robustness of clinically relevant classifiers, such as Autism Spectrum Disorder (ASD) prediction.
  3. Flexibility and Robustness: The 2.5D approach and the use of multiple 2D slices from different anatomical locations allow the model to generate high-quality harmonized images without extensive additional training datasets.

Appendix

Prompts and Tools for writing the post

ChatGPT

  1. introduce under these structure <some structure>
  2. use two sentence to describe the paper
  3. The latex format for each loss function
  4. Does the pdf explain the paper clear, is there anything to add?

Gemini

  1. introduce under these structure <some structure>
  2. use two sentence to describe the paper



