Introduction

Magnetic resonance imaging (MRI) is a non-ionizing imaging method. During a scan, the tissue absorbs electromagnetic radiation pulses emitted by the device and then emits an RF signal. This RF signal is received by an antenna, from which the image is formed. The raw data obtained from the scan lie in a Fourier space called K-space; only through an inverse Fourier transform (IFT) can an anatomically meaningful MRI image be generated. A traditional MRI scan takes relatively long, which reduces patient comfort and makes the scan susceptible to motion artifacts. In situations where movement cannot be avoided, such as fetal MRI or heart and lung imaging, the scan time must be reduced. In addition, dynamic MRI sequences, such as dynamic contrast-enhanced MRI (DCE-MRI), require a higher temporal resolution.
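The K-space/image relationship can be illustrated with a minimal NumPy sketch, where a synthetic square stands in for a real anatomical slice:

```python
import numpy as np

# Synthetic "image" standing in for an anatomical slice.
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0

# Forward model: the scanner effectively acquires the 2D Fourier
# transform of the slice (K-space).
kspace = np.fft.fftshift(np.fft.fft2(image))

# Reconstruction: the inverse Fourier transform (IFT) recovers the image;
# the magnitude is taken because MRI data are complex-valued.
reconstructed = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace)))

print(np.allclose(reconstructed, image))  # round trip is numerically lossless
```

With full sampling the round trip is lossless; the acceleration methods discussed below deliberately discard part of K-space and must compensate for the missing information.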

To solve the above problems, several software solutions for accelerating MRI exist. The classic method is CS-MRI [1], which is based on compressed sensing (CS): when the sampled signal is sparse in some transform domain, it can be sampled at a rate below the Nyquist rate. In CS-MRI, only part of the K-space is sampled. Later, ADMM-CSNet was proposed [2], which accelerates the CS reconstruction with a deep neural network. In general, CS methods greatly speed up the scanning process, but may lose fine, anatomically significant textures.

The authors propose a method based on a U-net and a GAN that takes adjacent K-space slices as input to reconstruct the subsampled K-space and then refine the result in the image domain.

Methodology

Architecture

Figure 1: Framework architecture

The overall structure of the network is shown in Figure 1. The network has two generators (G_k, G_{im}) and one discriminator. As shown in part (a) of Figure 1, the input of the network consists of three scan results that are adjacent in space or time, i.e. raw K-space data. In addition, each scan result is assigned a "mask matrix" M. This matrix simulates the subsampling process: the uncovered part keeps the original K-space data, and the covered part is filled with a random matrix Z whose elements are uniformly distributed. This subsampled K-space is fed into the first generator network G_k, which generates K^i_G. Since the measured part of the subsampled K-space is known, only the values of the area covered by M are taken from K^i_G, while the uncovered values are taken directly from the original K-space. After these steps, the restored K-space K^i_R is obtained, and the image I^i_R follows after an IFT. The image I^i_R is fed into the second generator network G_{im}, and its output image \tilde{I^i_R} together with the original fully sampled image I^i_O is passed to the discriminator D.
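The subsampling and data-consistency steps described above can be sketched as follows; random arrays stand in for real scans and for the output of G_k:

```python
import numpy as np

rng = np.random.default_rng(0)

# Fully sampled K-space of one slice (complex-valued); random data
# stands in for a real scan here.
K_O = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))

# Binary subsampling mask M: 1 = acquired line, 0 = skipped line.
M = np.zeros((64, 64))
M[::5, :] = 1.0  # keep every 5th phase-encoding line (~20% sampling)

# Subsampled input: acquired entries keep the measured values; missing
# entries are filled with uniformly distributed random values Z.
Z = rng.uniform(size=(64, 64))
K_P = M * K_O + (1 - M) * Z

# K_G stands for the output of the K-space generator G_k; a real network
# would predict it from K_P. Here it is only a placeholder.
K_G = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))

# Data-consistency step: keep the measured samples, use G_k's prediction
# only where the mask removed data.
K_R = M * K_O + (1 - M) * K_G

# Image-space input to the refinement generator G_im.
I_R = np.abs(np.fft.ifft2(K_R))

# The measured entries are untouched by the generator.
print(np.allclose(K_R[M == 1], K_O[M == 1]))
```

The key design point is that G_k can never corrupt the measured samples; it only fills in the lines the mask removed.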

Both generator networks are U-nets, which are autoencoders composed of convolutional layers. The discriminator is a CNN.

Loss-functions

The loss function of the generator G_k is the L2 norm of the pixel-wise difference between the output K-space and the original K-space.

L_k(\theta) = {\|K_G - K_O\|}_2 = {\|G_k(K_p; \theta) - K_O\|}_2

The loss function of the generator G_{im} consists of two parts:

  • The weighted summation of the L1 and L2 norms of the pixel-wise difference between the original image and the reconstructed image

          Note: F^H(\cdot) means the IFT function.

L_{IM}(\theta, \phi) = {\|I_O - \tilde{I_R}\|}_p = {\||F^H(K_O)| - G_{IM}(|F^H(K_P + \overline{M} \bigodot G_K(K_p; \theta))|; \phi)\|}_p
  • Adversarial loss derived from the two-player min-max value function of the GAN network
L_A(\theta, \phi) = \mathbb{E}(log[1 - D(G_{IM}(|F^H(K_P + \overline{M} \bigodot G_K(K_p; \theta))|; \phi))])

The loss function of the discriminator is not explicitly stated in the article. According to the source code provided by the authors on GitHub, it is a standard binary cross-entropy.
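Under these definitions, the losses can be sketched numerically as follows; the L1/L2 weights and the discriminator outputs are illustrative placeholders, not values from the paper:

```python
import numpy as np

def l2(x):
    """L2 norm of the pixel-wise difference tensor."""
    return np.sqrt(np.sum(np.abs(x) ** 2))

def l1(x):
    """L1 norm of the pixel-wise difference tensor."""
    return np.sum(np.abs(x))

# Toy tensors standing in for network outputs and ground truth.
rng = np.random.default_rng(1)
K_G, K_O = rng.standard_normal((2, 32, 32))
I_tilde, I_O = rng.uniform(size=(2, 32, 32))

# K-space generator loss: L2 norm of the difference to the original K-space.
L_k = l2(K_G - K_O)

# Image generator reconstruction loss: weighted sum of L1 and L2 norms.
# The weights lambda1/lambda2 are illustrative, not taken from the paper.
lambda1, lambda2 = 1.0, 1.0
L_im = lambda1 * l1(I_O - I_tilde) + lambda2 * l2(I_O - I_tilde)

# Adversarial loss: D(.) is the discriminator's probability that its input
# is a fully sampled image; here a placeholder scalar.
D_fake = 0.3
L_adv = np.log(1.0 - D_fake)

# Discriminator loss: standard binary cross-entropy on a real/fake pair.
D_real = 0.8
L_D = -(np.log(D_real) + np.log(1.0 - D_fake))

print(L_k >= 0 and L_im >= 0 and L_D >= 0)
```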

Experiments and results

Performance evaluation parameters

PSNR, SSIM [3]: These two are commonly used to measure the quality of reconstructed images. Both are obtained by comparing the original image and the reconstructed image pixel by pixel.
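A minimal PSNR implementation illustrates the idea; SSIM is more involved, and scikit-image provides it as `skimage.metrics.structural_similarity`:

```python
import numpy as np

def psnr(reference, reconstruction, data_range=1.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((reference - reconstruction) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy example: a clean image and a mildly degraded reconstruction.
rng = np.random.default_rng(0)
original = rng.uniform(size=(64, 64))
noisy = np.clip(original + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)

print(psnr(original, noisy))  # a few tens of dB for mild noise
```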

Although the above two indicators are widely used in image processing, they cannot really assess clinical usability. Therefore, the article uses additional indicators to quantitatively evaluate the similarity of anatomical structures. These indicators are not applied to the whole image; instead, the original and the reconstructed image are first processed with a specific segmentation tool and then compared. The preprocessing steps used are skull stripping as well as brain-tissue and lesion segmentation.

Dice Score: Evaluates the overlap between two regions.

MHD: The "modified Hausdorff distance", which evaluates the average distance between the boundaries of two regions.
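Both metrics are straightforward to sketch. Note that the paper applies MHD to region boundaries; this toy example uses all mask pixels for simplicity:

```python
import numpy as np

def dice_score(a, b):
    """Dice overlap between two binary masks (1.0 = identical)."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def modified_hausdorff(pts_a, pts_b):
    """Modified Hausdorff distance between two point sets:
    max of the two directed mean nearest-neighbor distances."""
    d = np.linalg.norm(pts_a[:, None, :] - pts_b[None, :, :], axis=-1)
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

# Two overlapping square masks as a toy segmentation pair.
a = np.zeros((32, 32), bool); a[8:24, 8:24] = True
b = np.zeros((32, 32), bool); b[10:26, 10:26] = True

print(dice_score(a, b))  # 2*196 / (256+256) = 0.765625

pts_a = np.argwhere(a)
pts_b = np.argwhere(b)
print(modified_hausdorff(pts_a, pts_b))
```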

Comparison research

This network is compared with the following three methods:

Zero-filled K-space: the subsampled K-space is used directly and the unsampled part is filled with zeros. This score serves as the lower bound of the performance.

CS-MRI: Using the traditional method of "compressed sensing"

ADMM-CSNet: Deep learning method based on "compressed sensing" and "Alternating direction method of multipliers"

In addition, the authors also conducted ablation studies:

Image-space GAN: only the image domain is reconstructed (G_{im}+D); the GAN used for K-space is omitted.

K-space GAN: only the K-space reconstruction (G_k+D) is performed, without the subsequent refinement step in the image domain.

Datasets

This research involves three datasets:

IXI Dataset: This dataset contains brain scans of healthy adults

2015 Longitudinal MS Lesion Segmentation Challenge [4]: brain scans of multiple sclerosis patients, acquired with four sequences (T1-w, T2-w, PD-w, and FLAIR).

DCE-MRI Dataset: The private DCE-MRI dataset from Soroka University Medical Center contains scans of stroke and brain tumor patients.

The first two are static scans: multiple adjacent slices are scanned (approximately) at the same time. The DCE-MRI dataset is a dynamic scan, i.e. the same plane is scanned repeatedly over time.

Results

IXI Dataset

For the IXI dataset, the article provides Figure 2 as a visualization of the experimental results (sampling rate 20%, i.e. only 20% of the original K-space data is used), and a quantitative comparison of the whole image using PSNR and SSIM (Table 1 and Table 2).

Figure 2: Examples of reconstructed MR images from undersampled k-space data using only 20% of the data

In addition, the authors preprocessed the scanned images: the skulls were removed with the Brain Extraction Tool (BET) [5], and MHD alone was used as the indicator for comparing the brain region.

Table 1: Reconstruction fidelity at different subsampling ratios

Table 2: Comparison of MHD after skull removal

Furthermore, the authors segmented the brain image by using FSL-FAST [6][7], divided the brain tissue into three parts: white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF), and used Dice score as an indicator for comparison (Table 3).

Table 3: Dice scores calculated for the segmented reconstructed MRIs

2015 Longitudinal MS Lesion Segmentation Challenge

For the second dataset, the article provides the experiment results (sampling rate 20%) as shown in Figure 3 and Table 4, and provides a quantitative comparison of the whole image using PSNR and SSIM. As mentioned earlier, the dataset contains four sequences, T1, T2, PD and FLAIR.

Figure 3: Examples of multi-modal reconstructed MR images of MS patients

Table 4: PSNR and SSIM calculated for the reconstructed multi-modal MS-lesion MRIs

In addition, to demonstrate the clinical usability of the reconstruction results, the authors performed lesion segmentation based on a DL tool presented by Shwartzman et al. [8]. They also provide manual segmentations as a reference (ground truth, GT). A visual result is shown in Figure 4. The quantitative results in Table 5 first compare the segmentations produced by Shwartzman's tool with the GT, and then compare the DL-tool segmentations of the reconstructed images with those of the fully sampled images.

Figure 4: Annotations and the lesion segmentation based on reconstructed scans

Table 5: Dice scores calculated for the lesion segmentation in reconstructed MRIs with respect to the GT and with respect to fully sampled scans

DCE-MRI Dataset

The DCE-MRI dataset contains images of patients with brain tumors and strokes, including images before, during, and after injection of a contrast agent (CA). According to the article, the CA changes the T1 relaxation time [9]. To verify the clinical usability of the reconstructed images, the authors examined the integrity of the blood-brain barrier (BBB) by computing CA concentration curves at different positions (a dysfunctional BBB indicates a lesion). The visualization is shown in Figure 5: the red, purple, and green squares in (a) mark arbitrarily selected healthy, vascular, and pathological regions, respectively.
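The idea of a CA concentration curve can be sketched as a per-frame ROI mean over the dynamic series; all data below are synthetic and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy DCE-MRI series: T frames of the same slice. Real curves are derived
# from CA-induced signal changes; here a synthetic uptake curve is
# injected into one region for illustration.
T, H, W = 20, 32, 32
series = 0.1 * rng.uniform(size=(T, H, W))
uptake = 1.0 - np.exp(-np.arange(T) / 5.0)   # hypothetical CA uptake
series[:, 8:16, 8:16] += uptake[:, None, None]

# ROI mask, e.g. a manually selected square region as in Figure 5(a).
roi = np.zeros((H, W), bool)
roi[8:16, 8:16] = True

# Concentration curve: mean ROI intensity per frame.
curve = series[:, roi].mean(axis=1)

print(curve[-1] > curve[0])  # signal rises after CA injection
```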

Figure 5: CA concentration curves calculated based on fully sampled scans and reconstruction methods

In addition, the authors introduced the pharmacokinetic parameter k_{trans}, based on the Tofts model [10][11], as another check of clinical usability. The red areas in Figure 6(a) indicate locally high permeability, which again implies BBB failure. A frame-by-frame comparison of PSNR and SSIM is shown in Figure 6(b-c).
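As a rough sketch of the standard Tofts model (not the authors' fitting pipeline), the tissue concentration is the arterial input function (AIF) convolved with an exponential kernel and scaled by k_{trans}; the AIF and all parameter values below are hypothetical:

```python
import numpy as np

def tofts_ct(t, k_trans, v_e, c_p):
    """Standard Tofts model:
    C_t(t) = k_trans * integral_0^t C_p(tau) * exp(-k_ep*(t - tau)) dtau,
    with k_ep = k_trans / v_e, evaluated by discrete convolution."""
    k_ep = k_trans / v_e
    dt = t[1] - t[0]
    kernel = np.exp(-k_ep * t)
    return k_trans * np.convolve(c_p, kernel)[: len(t)] * dt

t = np.linspace(0, 5, 100)   # time in minutes
c_p = np.exp(-t)             # hypothetical mono-exponential AIF
c_t = tofts_ct(t, k_trans=0.2, v_e=0.3, c_p=c_p)

# A higher k_trans (leakier BBB) yields a higher tissue concentration,
# which is why k_trans maps highlight regions of BBB failure.
c_t_leaky = tofts_ct(t, k_trans=0.6, v_e=0.3, c_p=c_p)
print(c_t_leaky.max() > c_t.max())
```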

Note: t_x denotes the x-th frame of the DCE-MRI sequence. The k_{trans} parameters are calculated based on t_{10}.

Figure 6: k_{trans} maps based on reconstructed scans and the comparison of image fidelity frame-by-frame

Conclusion

In the article, the authors introduced a GAN-based reconstruction tool for subsampled MRI. According to the experimental results, the proposed network outperforms the classical CS-MRI method and its deep learning implementation. The authors claim that the network is "real-time software-only".

In addition, the authors point out that although the article uses various metrics to show that the reconstructed images are high-fidelity and clinically usable, it might also be possible to reflect clinical usability directly in the loss function, i.e. to find a loss function that directly quantifies clinical usability.

Student review

The article clearly describes the network structure and supports its performance claims with detailed experimental data. The authors also considered a key feature of medical imaging: they evaluated not only image fidelity but also clinical usability.

However, the article also leaves some open questions. For example, the authors do not describe the training process (time and hardware resources). Moreover, the datasets used in the experiments lack diversity, so there is no evidence that the network is also suitable for MRI of other body parts, especially parts with large inter-individual differences. In addition, even if the framework generalizes across body parts, it may still require a specialized training dataset for each part.

Reference

[1] Lustig, M., Donoho, D., & Pauly, J. M. (2007). Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6), 1182-1195.

[2] Yang, Y., Sun, J., Li, H., & Xu, Z. (2018). ADMM-CSNet: A deep learning approach for image compressive sensing. IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600-612.

[4] Carass, A., Roy, S., Jog, A., Cuzzocreo, J. L., Magrath, E., Gherman, A., ... & Crainiceanu, C. M. (2017). Longitudinal multiple sclerosis lesion segmentation data resource. Data in Brief, 12, 346-350.

[5] Smith, S. M. (2002). Fast robust automated brain extraction. Human Brain Mapping, 17(3), 143-155.

[6] Zhang, Y., Brady, M., & Smith, S. (2001). Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging, 20(1), 45-57.

[7] Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W., & Smith, S. M. (2012). FSL. NeuroImage, 62(2), 782-790.

[8] Shwartzman, O., Gazit, H., Shelef, I., & Riklin-Raviv, T. (2019). The impact of an inter-rater bias on neural network training. arXiv preprint arXiv:1906.11872.

[9] Tofts, P. S., & Kermode, A. G. (1991). Measurement of the blood-brain barrier permeability and leakage space using dynamic MR imaging. 1. Fundamental concepts. Magnetic Resonance in Medicine, 17(2), 357-367.

[10] Tofts, P. S. (1997). Modeling tracer kinetics in dynamic Gd-DTPA MR imaging. Journal of Magnetic Resonance Imaging, 7(1), 91-101.

[11] Tofts, P. S. (2010). T1-weighted DCE imaging concepts: modelling, acquisition and analysis. Signal, 500(450), 400.
