Documentation
Segmentation of Medical Images with Deep Learning
A bachelor's thesis by Vincent von Büren
With the increasing amount of data and rising costs in the medical sector, more and more Deep Learning applications are finding their way into the industry. The segmentation of biomedical images is becoming increasingly important in this context. Extracting meaningful patterns can lead to improvements in medical treatment, but also to more efficient investigation of diseases, thereby counteracting the cost pressure in the healthcare industry. One of the most influential Deep Learning models for biomedical segmentation tasks is the U-Net architecture.
As part of my bachelor's thesis, I implemented this architecture and tested its applicability to another biomedical dataset. To evaluate the model's capabilities, four common metrics for image segmentation were introduced. In a first implementation, the U-Net architecture was tested without Data Augmentation, but with the same parameters as in the original configuration. In the next step, Data Augmentation was taken into account. To examine whether the architecture's performance can be improved, the model was combined with additional Residual Blocks and densely connected blocks. The resulting methods were evaluated with the same predefined metrics. Finally, the U-Net was examined and evaluated with regard to different network depths and filter sizes.
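As a hedged example, the Dice coefficient and the Intersection over Union (IoU) are two metrics commonly used for segmentation; the NumPy sketch below shows how they can be computed on binary masks. Whether these match the four metrics used in the thesis is an assumption, and the function names are illustrative.

import numpy as np

def dice_coefficient(y_true, y_pred, eps=1e-7):
    # Dice coefficient between two binary masks (values 0 or 1).
    y_true = y_true.astype(bool)
    y_pred = y_pred.astype(bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    return (2.0 * intersection + eps) / (y_true.sum() + y_pred.sum() + eps)

def iou(y_true, y_pred, eps=1e-7):
    # Intersection over Union (Jaccard index) between two binary masks.
    y_true = y_true.astype(bool)
    y_pred = y_pred.astype(bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    return (intersection + eps) / (union + eps)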
It was shown that a reimplementation of the original network can achieve remarkable results on a similar biomedical dataset. Furthermore, it was found that leaner architectures, with regard to network depth and filter size, can achieve equal or even better results than their deeper counterparts.
TensorFlow's high-level API Keras was used for the implementation.
Architecture:
The illustration below presents the U-Net architecture. It consists of a contracting (encoder) path and an expanding (decoder) path. The images are downsampled until the bottleneck layer is reached, and with each downsampling step the number of filters increases. Each blue block is a convolutional block. In the contracting path, a max pooling operation performs the downsampling after each convolutional block. In the expanding path, upsampling is done with transposed convolutions. In order to combine the extracted features with their location, the feature maps of the contracting path are concatenated with the feature maps at the same level of the expanding path.
Illustration: U-Net architecture
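As a rough sketch of how such an encoder-decoder with skip connections can be expressed in Keras, the snippet below builds a U-Net-style model. The depth of four pooling steps and 64 starting filters mirror the original configuration, while the function names, input shape, and other details are illustrative assumptions rather than the exact thesis code:

from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU, corresponding to one "blue block" in the diagram.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(256, 256, 1), base_filters=64, depth=4):
    inputs = layers.Input(input_shape)
    skips, x = [], inputs

    # Contracting path: conv block, store skip connection, then 2x2 max pooling.
    for d in range(depth):
        x = conv_block(x, base_filters * 2 ** d)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)

    # Bottleneck with the largest number of filters.
    x = conv_block(x, base_filters * 2 ** depth)

    # Expanding path: transposed convolution, concatenate the skip, then conv block.
    for d in reversed(range(depth)):
        x = layers.Conv2DTranspose(base_filters * 2 ** d, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skips[d]])
        x = conv_block(x, base_filters * 2 ** d)

    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return Model(inputs, outputs)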
In addition to the original U-Net model, I implemented not only an architecture with additional Residual Blocks in each layer of the U-Net, but also a densely connected architecture that is shallower than the original version. The illustrations below present the additional building blocks.
Illustration: Densely connected U-Net block
Illustration: Residual block in U-Net layer
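The following Keras sketch illustrates how such building blocks could be wired. It assumes a simple two-convolution residual block and a small densely connected block; the layer counts, growth rate, and names are illustrative assumptions, not the exact thesis configuration:

from tensorflow.keras import layers

def residual_block(x, filters):
    # Two convolutions plus a shortcut, added before the final activation.
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)  # 1x1 conv to match channel count
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.Add()([shortcut, y])
    return layers.Activation("relu")(y)

def dense_block(x, growth_rate=16, n_layers=4):
    # Each convolution sees the concatenation of all previous feature maps.
    features = [x]
    for _ in range(n_layers):
        inputs = layers.Concatenate()(features) if len(features) > 1 else features[0]
        y = layers.Conv2D(growth_rate, 3, padding="same", activation="relu")(inputs)
        features.append(y)
    return layers.Concatenate()(features)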
Dataset:
The dataset chosen for this thesis comes from the 2018 Data Science Bowl Nuclei Challenge. It contains a large number of segmented nuclei images, which vary not only in cell type but also in magnification and image modality. The dataset consists of 670 training and 65 test images; only the 670 training images come with segmentation masks for later comparison.
Illustration: Dataset examples with original image and segmentation mask
As can be seen in the illustrated images, the images vary in pixel intensity. About 58% of the images have an average pixel intensity below 50.
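A small sketch of how such an intensity statistic could be computed, assuming the Data Science Bowl folder layout (one images/ subfolder per sample) and intensities on a 0-255 scale; the path pattern and function name are assumptions:

import numpy as np
from pathlib import Path
from skimage import io

def fraction_dark_images(image_dir, threshold=50):
    # Share of images whose average pixel intensity (0-255) lies below the threshold.
    means = []
    for path in Path(image_dir).glob("*/images/*.png"):   # assumed Data Science Bowl folder layout
        img = io.imread(path)                             # RGB(A) or grayscale, values 0-255
        gray = img[..., :3].mean(axis=-1) if img.ndim == 3 else img
        means.append(float(gray.mean()))
    return float(np.mean(np.array(means) < threshold))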
Results:
The first goal of this thesis was to implement the U-Net according to the original configuration, i.e. with the same training parameters as described in the paper. In the next step, the training was repeated with a larger batch size of 4. The illustration below shows the scores for the common segmentation evaluation metrics and compares the results obtained with batch size 1 and batch size 4.
Illustration: Evaluation of U-Net models with batch size 4 vs batch size 1
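A hedged sketch of how the second training run with batch size 4 could be set up in Keras, reusing the hypothetical build_unet builder from the architecture section; the optimizer, loss, epoch count, and placeholder data arrays are illustrative assumptions and do not reproduce the original training parameters exactly:

import numpy as np

# Placeholder arrays standing in for the preprocessed nuclei images and masks.
train_images = np.random.rand(16, 256, 256, 1).astype("float32")
train_masks = (np.random.rand(16, 256, 256, 1) > 0.5).astype("float32")

model = build_unet()  # builder sketched in the architecture section above

# Loss and optimizer are illustrative choices, not the original U-Net training setup.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_images, train_masks, batch_size=4, epochs=50, validation_split=0.1)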
Finally, the models were also examined with regard to complexity. The model complexity was reduced by changing the architecture depth and the filter sizes. The last illustration shows the results for the shallower versions compared to their deeper counterparts.
Illustration: Evaluation of shallower U-Net models with comparison to deeper counterparts
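Using the same hypothetical build_unet builder, a leaner configuration can be obtained simply by lowering the depth and the number of starting filters; the concrete values below are illustrative, not the exact configurations evaluated in the thesis:

# Reuses the hypothetical build_unet builder from the architecture section.
deep_model = build_unet(depth=4, base_filters=64)     # original-style depth and filter count
shallow_model = build_unet(depth=3, base_filters=32)  # leaner variant; values are assumptions

print("deep:", deep_model.count_params(), "shallow:", shallow_model.count_params())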