1. OUTLINE

Outline
Motivation
Model-Based Approaches
1. Fractal algorithms
2. Space-Filling algorithms
Data-Driven Approaches
1. Generative Adversarial Networks
2. Variational Autoencoders
Review
1. Comparison
2. Conclusions
References

2. MOTIVATION

According to data from [1], coronary disease was the biggest killer worldwide in 2019, with over 9.1M deaths. It is estimated that over 60M people in the world develop a heart or circulatory disease every year, the most common being ischemic (coronary) heart disease, peripheral arterial disease and stroke.

Most of these cardiovascular diseases originate in a remodeling of the blood vessels driven by many different underlying mechanisms. In order to develop therapies for these problems, we need to fully understand how these mechanisms work and how they affect the physiology of our circulatory system [2].

Computational models and artificial intelligence (AI) could become some of the most powerful tools at our disposal. These models could be used to gain insights into the pathology, help radiologists with the diagnosis and prognosis of the disease, and assist surgical personnel during interventions, follow-ups and training.

The cornerstone of all the mentioned applications is medical data; it is the most crucial puzzle piece and it has proven over time to be one of the main bottlenecks in the application of AI in healthcare [3,5]. Medical data is notorious for having many issues, such as lack of availability, class imbalances, poor data quality, privacy concerns, leaks of sensitive patient information and the cost of annotations by trained professionals. At a time when there are incredibly powerful algorithms at our disposal, the demand for medical data is at an all-time high.

One of the most commonly used techniques to increase the amount of available information is data augmentation. This approach consists of applying transformations to the available data, such as random crops, rotations or flips in the case of medical images. Although this has proven to be very useful over time, AI algorithms would benefit a lot more from newly synthesized, realistic and anatomically accurate medical data.

To this end, there has been an effort to develop models that can synthesize blood vessels, capturing the variability in size, shape and structure of the real samples while providing a certain degree of control over the end result. We can distinguish two types of approaches: model-based and data-driven.

3. MODEL-BASED APPROACHES

3.1. Fractal algorithms

Lindenmayer systems (L-systems) were first introduced as a mathematical theory for explaining plant development. The central concept of these systems is rewriting: a sequence of characters defining the object is iteratively updated following certain rewriting rules, which are applied simultaneously to all the letters in the given word.

For plant modelling, we can use the generated sequence as a set of instructions for an agent to travel through space, growing the branching plant-like structure along these trajectories. For instance, a simple set of instructions for 2D would be:

F Move forward; + Turn left by angle δ; − Turn right by angle δ; [ Branch starts; ] Branch ends

Figure 1. Tree-like structures based on simple L-systems [6]
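The rewriting and interpretation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of [6]: the function names `expand` and `interpret`, the example rule and the angle are assumptions chosen for this sketch.

```python
import math


def expand(axiom, rules, iterations):
    """Rewrite every symbol of the word simultaneously, `iterations` times."""
    word = axiom
    for _ in range(iterations):
        # Symbols without a rule (+, -, [, ]) are copied unchanged.
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


def interpret(word, delta_deg, step=1.0):
    """Follow the F, +, -, [, ] instructions and return 2D line segments."""
    x, y, heading = 0.0, 0.0, math.pi / 2  # start at the origin, pointing up
    delta = math.radians(delta_deg)
    stack = []      # saved agent states, one per open branch
    segments = []
    for symbol in word:
        if symbol == "F":                  # move forward, drawing a segment
            nx = x + step * math.cos(heading)
            ny = y + step * math.sin(heading)
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif symbol == "+":                # turn left by delta
            heading += delta
        elif symbol in "-\u2212":          # turn right (ASCII or Unicode minus)
            heading -= delta
        elif symbol == "[":                # branch starts: remember the state
            stack.append((x, y, heading))
        elif symbol == "]":                # branch ends: return to saved state
            x, y, heading = stack.pop()
    return segments


# Illustrative rule and angle (assumptions of this sketch).
rules = {"F": "F[+F]F[-F]F"}
word = expand("F", rules, 2)
segments = interpret(word, delta_deg=25.7)
```

Each additional iteration multiplies the number of drawn segments, which is why a very short rule set can describe a large branching structure.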

The previous examples show the recursive generation of branching systems in nature such as trees or bushes, but interestingly these systems show similarities to branching systems in biology such as vasculature, neural systems, lung bronchi ramifications, etc.

Based on this premise, the author of [7] developed a system in which the generation of vasculature follows a distribution of branches dictated by the formalism of L-systems in 2D.

Axiom Ω: X(L,W)

Rewriting Rule P: X(L,W) → F(L,W)[-(δ1)X(L,W)][+(δ2)X(L,W)]

Figure 2. L-system-based vessels [7]
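As a rough illustration of the parametric rule above, the following sketch expands X(L,W) recursively into drawn segments that carry a length and a width. It is not the implementation of [7]: the `Segment` class, the `grow` function and the scaling factors `length_ratio` and `width_ratio` are assumptions of this sketch (setting both ratios to 1.0 reproduces the rule exactly as written, with unchanged L and W).

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    start: tuple
    end: tuple
    width: float


def grow(start, heading, length, width, generations, delta1, delta2,
         length_ratio=0.8, width_ratio=0.7):
    """Apply X(L,W) -> F(L,W)[-(d1)X(L,W)][+(d2)X(L,W)] `generations` times."""
    if generations == 0:
        return []  # an unexpanded X is a placeholder and draws nothing
    end = (start[0] + length * math.cos(heading),
           start[1] + length * math.sin(heading))
    segments = [Segment(start, end, width)]           # F(L, W)
    # Hypothetical shrinking of the daughter branches (assumption).
    child_length = length * length_ratio
    child_width = width * width_ratio
    # [-(delta1) X]: daughter branch turning right
    segments += grow(end, heading - delta1, child_length, child_width,
                     generations - 1, delta1, delta2,
                     length_ratio, width_ratio)
    # [+(delta2) X]: daughter branch turning left
    segments += grow(end, heading + delta2, child_length, child_width,
                     generations - 1, delta1, delta2,
                     length_ratio, width_ratio)
    return segments


tree = grow((0.0, 0.0), math.pi / 2, 1.0, 0.2, generations=4,
            delta1=math.radians(30), delta2=math.radians(40))
```

Varying the two branching angles independently is what produces the spectrum between symmetric and strongly asymmetric trees.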

As can be seen in Figure 2, the results of this work show a wide spectrum of tree-like structures that capture the theoretical properties of real arterial branching systems, and certain real branching systems can be modelled with this algorithm. Image A in Figure 2 represents 'delivering vessels' that branch extensively to reach the capillary bed, while images D and E are closer to distributing vessels, such as in the coronary circulation, where the main coronary arteries circle the heart and secondary branches arise with high branching angles.

In a similar work, the authors of [8] introduce anatomical constraints in order to generate more plausible arterial trees, using a stochastic and parametric L-system in 3D. They define the same rules as in the previous paper, adapted to work in three-dimensional space. They limit the physical space in which the arterial tree may grow using real organ segmentations, and add a randomness factor to the construction of the tree.

Figure 3. Maximum Intensity Projection (MIP) images of L-system vessels with boundary conditions [8]

This approach proves to be a simple yet effective way of generating 2D and 3D structures that resemble vessel trees. The simplicity of the algorithm (a single symbol sequence represents the complete tree) makes it very attractive in terms of computing cost. However, these models also come with certain drawbacks. When constraints are implemented, creating the 3D structures can take upwards of 50 minutes, as in the liver case (200,000 points on the surface). Also, there is no direct way to control the generation process: the only way to obtain results that resemble real structures is to tune the parameters by trial and error and to impose boundaries. The results are also limited in terms of realism: although real vessels resemble fractal trees, their formation depends on many other physiological factors which this method ignores. Lastly, this method is quite limited in the diversity of the geometries it creates; one set of parameters will always yield the same output, and this can only be mitigated by introducing a randomness factor in the tree growth.

3.2. Space-Filling algorithms

Another popular model-based approach is using space-filling algorithms [9, 10]. These models represent free space (tissue without vessels) as a set of discrete points named attraction points, and the vasculature as a set of vessel nodes. The arterial tree is represented as a directed graph of nodes and edges. Each vessel node decides how it should grow in order to reach the attraction points it sees. Similar to the looping nature of fractal growth in L-systems, these models develop in a feedback loop of growing and creating new attraction points. The algorithm proposed in [9] develops the arterial and venous systems in parallel, iterating over 5 steps:

1. Sampling of new attraction nodes representing oxygen drains or hypoxic tissue.
2. Growth of arterial blood vessels under geometric constraints, trying to supply oxygen to the hypoxic tissue by growing towards attraction points.
3. Oxygen drains that are sufficiently supplied are added as carbon-dioxide sources, which serve as attraction points for the venous system.
4. Growth of the venous tree.
5. Growth of the overall tissue, modelling organ development.
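To make the growth step more concrete, here is a heavily simplified 2D sketch of a single tree growing towards attraction points. It is not the algorithm of [9]: it omits the venous tree, tissue growth, resampling of attraction points and geometric constraints, and the function name `grow_tree` and the parameters `influence`, `kill` and `step` are assumptions made for illustration.

```python
import math
import random


def grow_tree(attractors, root, influence=0.3, kill=0.05, step=0.04,
              max_iters=100):
    """Grow a directed tree (node list + parent indices) towards attractors."""
    nodes = [root]
    parents = [-1]                 # parent index per node; -1 marks the root
    attractors = list(attractors)
    for _ in range(max_iters):
        if not attractors:
            break                  # all tissue is supplied
        # 1. Each attraction point pulls on its closest node, if within reach.
        pulls = {}
        for point in attractors:
            idx = min(range(len(nodes)),
                      key=lambda i: math.dist(nodes[i], point))
            d = math.dist(nodes[idx], point)
            if 0.0 < d < influence:
                px, py = pulls.get(idx, (0.0, 0.0))
                pulls[idx] = (px + (point[0] - nodes[idx][0]) / d,
                              py + (point[1] - nodes[idx][1]) / d)
        if not pulls:
            break                  # nothing left within reach
        # 2. Every pulled node sprouts a child along the mean pull direction.
        for idx, (dx, dy) in pulls.items():
            norm = math.hypot(dx, dy)
            if norm == 0.0:
                continue           # opposing pulls cancel out
            nx, ny = nodes[idx]
            nodes.append((nx + step * dx / norm, ny + step * dy / norm))
            parents.append(idx)
        # 3. Attraction points that are now close to a vessel are removed.
        attractors = [a for a in attractors
                      if min(math.dist(a, n) for n in nodes) > kill]
    return nodes, parents


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(25)]
nodes, parents = grow_tree(points, root=(0.5, 0.0), influence=1.5)
```

The `parents` list encodes the directed graph: following parent indices from any node leads back to the root.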

This algorithm is able to generate vessel trees in milliseconds which can be turned into 3D triangle meshes (following the method proposed by [14]) or 2D surface textures.

Figure 4. Results taken from [9] showing rendered images of a colon and eye with vessel trees synthesized by their method in under a second.

Video 1. Video taken from [9] showing the space filling algorithm in different scenarios

Following this train of thought, in [11] Martin J. Menten et al. developed a space filling based algorithm for the synthetic generation of Optical Coherence Tomography Angiographies (OCTA). OCT images are crucial for the diagnosis of ocular diseases that manifest as changes in the vessels of the retina. With recent advances in deep learning, semantic segmentations of such images could help ophthalmologists diagnose these diseases much more efficiently.

Similar to the method proposed in [9], the cited authors once again represent vasculature as a directed graph of nodes and edges which is grown iteratively. The construction of the vascular trees is guided by concentration maps of oxygen and vascular endothelial growth factor (VEGF) calculated at each step. VEGF-rich regions are related to low perfusion, and new vessels are developed in order to supply oxygen to these hypoxic regions.

The inner retina is supplied by 4 vascular plexuses, which can be grouped into the superficial vascular complex (SVC) and the deep vascular complex (DVC). In order to create vascular trees that mimic real OCTA images, they first generate the SVC from vessel stumps distributed at the edges of the retina, which then grow towards the center. Once this is done, they implement the DVC as smaller vessels which originate from the SVC and penetrate deeper into the tissue. Lastly, they enforce regions with low VEGF to stop angiogenesis at the center of the retina (foveal avascular zone) and in the outer retinal layers.

In order to train AI algorithms using these synthetic images, they first need to be similar to real OCTA images. To this end, once the 3D geometry has been generated, they apply several processing techniques to obtain 2D grayscale images containing the most common OCT physics-based artifacts.

Figure 5. Synthetically generated retinal graphs [11]

Because the created images have intrinsically matching ground truth labels, the authors could easily train segmentation algorithms with the synthetic data to evaluate their results. Qualitatively, the segmentations produced by networks trained on synthetic images improved with respect to the same networks trained on real data. They also observed that the performance of state-of-the-art networks improves quantitatively after pre-training on synthetic data and then fine-tuning on real samples.

Figure 6. Comparison of segmentation results and ground truth [11]

Once the model of the retinal vasculature is obtained, they convert the 3D geometry into 2D synthetic OCTA images. This conversion is done through MIP and by introducing the most common artifacts found in real images. These artifacts include flow projection, eye motion and vitreous floater artifacts as well as background signal caused by capillary vessels.
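The projection step itself is simple. The sketch below shows a maximum intensity projection over a small voxel volume stored as nested lists; the function name and the `volume[z][y][x]` layout are assumptions of this sketch, and the OCTA artifacts described above are not modelled.

```python
def max_intensity_projection(volume):
    """Collapse volume[z][y][x] into a 2D image by keeping, for every
    (y, x) position, the brightest voxel found along the z axis."""
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [[max(volume[z][y][x] for z in range(depth))
             for x in range(width)]
            for y in range(height)]


# Two z-slices of a tiny 2 x 3 image.
volume = [
    [[0, 1, 0],
     [0, 0, 0]],
    [[0, 0, 0],
     [2, 0, 3]],
]
image = max_intensity_projection(volume)
```

Because only the maximum along the viewing axis survives, vessels at different depths end up overlaid in the same 2D image.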

This method was able to generate highly realistic synthetic images with matching ground truth and great anatomical detail using a physiology-based space-filling model. These algorithms are a great alternative to data-hungry deep learning networks, as they do not require any training or ground truth data and their computational cost is extremely low (in the range of milliseconds for [9]). The resulting geometries also show great realism, since the algorithm follows a physiologically motivated approach to vessel growth using attraction points in space. The main drawback of these models is their limited variability: they do not fully capture the diversity of real data.

4. DATA-DRIVEN APPROACHES

4.1. Generative Adversarial Networks

Data-driven algorithms are a powerful alternative to model-based approaches, as they can learn to capture the complexity of real data and produce anatomically diverse samples. Generative Adversarial Networks (GANs) were first introduced by Goodfellow et al. [12] in 2014 as a combination of 2 networks, a generator and a discriminator, that work in competition. The generator tries to synthesize new data that follows the real data distribution starting from a random noise sample, while the discriminator tries to classify samples as real or fake, providing supervision for the generator training. This framework can also be applied to our scenario: in 2018, Wolterink et al. [4] proposed a GAN for generating synthetic coronary artery vessels.

One of the main problems when using deep learning for generating 3D data is enforcing that the resulting volumes are contiguous, that is, without any holes or continuity errors. Memory issues are also a problem and the synthesized volumes are often restricted in size. In order to overcome these issues, they propose a parametrization of the 3D geometry as a 1D 4-channel sequence. The proposed parametrization represents the vessel as a sequence of 4 dimensional data points. Each point encodes information representing the centerline of the vessel in space and its radius (x, y, z, r).

The adversarial neural network consists of a generator (G) and a discriminator (D). The generator takes a noise vector sample z and turns it into the 4-dimensional parametrization of the artery G(z), which is fed to the discriminator alongside real examples of artery parametrizations to be classified as real or fake. The proposed framework also allows the generator to take an additional input describing characteristics of the desired synthesized artery. Both networks are trained at the same time, jointly optimizing the objective function.

This function differs from the traditional cost function of a GAN as it includes a different distance function (Wasserstein distance [13]) for the generator as well as the introduction of conditional training, allowing for more control over the result of the generation.
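For reference, the Wasserstein GAN objective of [13], written here with an optional conditioning input, has the following general form. The notation ($p_r$ for the real data distribution, $p_z$ for the noise distribution, $c$ for the conditioning input, $\mathcal{D}$ for the set of 1-Lipschitz functions) is assumed for this sketch, and the exact formulation used in [4] may differ.

```latex
\min_{G} \; \max_{D \in \mathcal{D}} \;
\mathbb{E}_{x \sim p_r}\!\left[ D(x \mid c) \right]
\; - \;
\mathbb{E}_{z \sim p_z}\!\left[ D\big(G(z \mid c) \mid c\big) \right]
```

The discriminator is rewarded for scoring real parametrizations higher than generated ones, while the generator is rewarded for closing that gap.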

To validate their results, they generated 4412 samples and compared their length, volume and tortuosity to those of the training set (also 4412 samples).
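The three descriptors can be computed directly from the (x, y, z, r) parametrization. The sketch below is one plausible way of doing so, not the evaluation code of [4]: tortuosity is taken as centerline length divided by end-to-end distance, and volume is approximated by stacking truncated cones between consecutive points; both definitions are assumptions of this sketch.

```python
import math


def centerline_length(points):
    """Sum of distances between consecutive centerline positions."""
    return sum(math.dist(p[:3], q[:3]) for p, q in zip(points, points[1:]))


def tortuosity(points):
    """Centerline length divided by the straight end-to-end distance."""
    chord = math.dist(points[0][:3], points[-1][:3])
    return centerline_length(points) / chord


def volume(points):
    """Approximate the lumen volume as a chain of truncated cones."""
    total = 0.0
    for (*p, r1), (*q, r2) in zip(points, points[1:]):
        h = math.dist(p, q)
        total += math.pi * h * (r1 * r1 + r1 * r2 + r2 * r2) / 3.0
    return total


# A straight vessel of radius 1 and length 2 along the z axis.
vessel = [(0.0, 0.0, 0.0, 1.0), (0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 2.0, 1.0)]
```

For this straight test vessel the tortuosity is 1 and the volume equals that of a cylinder, which makes the functions easy to sanity-check.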

Figure 7. Results from the Generative Adversarial Network [4]

This paper shows the potential that deep learning models have for synthesizing medical data; however, it has a few shortcomings. First of all, the generated arteries have no bifurcations or ramifications: they only represent the main coronary artery branch. In addition, this model is limited by the same problem it tries to solve: it requires medical data for training which, as mentioned at the start of this post, is extremely scarce. On the positive side, it provides a great amount of control over the end result, as the generation can be conditioned on specific characteristics. Lastly, as the generator samples from a random distribution, it is able to synthesize a huge variety of results, capturing the complexity and diversity of real data and overcoming one of the main drawbacks of model-based approaches.

4.2. Variational Autoencoders

Lastly, another more recent example of a data driven model implementation for the generation of 3D synthetic vessels was proposed by Paula Feldman et al. [5]. In this work, the proposed model is a Variational Autoencoder (VAE), based on a Recursive Variational Neural Network. Unlike the previously mentioned GAN model, the proposed algorithm is able to synthesize multi-branch blood vessels.

There are certain similarities to the presented model-based space-filling methods, where the arterial tree was understood as a graph; here we have a tuple of nodes and edges (T, E) in which a node represents a vessel segment V. In a similar fashion to [4], the 3D geometry of the vessel segments is reduced to a simpler representation. Each vessel segment consists of ordered 4D points encoding the position in space and the radius of the artery at that point.

The model architecture consists of an encoder and a decoder whose main objective during training is to recover the input after it has gone through both networks. The encoder converts a tree structure into a learnt representation that follows a certain probabilistic distribution. The trained decoder can sample from this distribution to obtain a new synthetic tree structure.

The architecture of the decoder is not straightforward. It has 2 main components: a node classifier and a features decoder multi-layer perceptron (MLP). The node classifier tries to predict the nature of a synthesized node by classifying it into 3 classes: leaf node, internal node with one bifurcation and internal node with two bifurcations. The features decoder MLP then predicts the attributes of the node, such as position and radius. There are two additional components, the right and left decoder MLPs, which recursively decode the next node in the tree. They do not always participate, only when the classifier determines the current node to be a bifurcation. Lastly, they also add 3 auxiliary fully connected networks, which appear before and after the bottleneck and help shape the latent space into the desired distribution.
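The recursive control flow of such a decoder can be sketched without any neural network by passing plain functions in place of the MLPs. Everything here is a hypothetical stand-in rather than the code of [5]: the names `decode_tree` and `Node`, the class labels, the toy functions, and in particular the choice to use only the right decoder for the single-branch class are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

LEAF, ONE_CHILD, TWO_CHILDREN = 0, 1, 2   # the 3 node classes


@dataclass
class Node:
    attributes: Sequence[float]           # e.g. position and radius
    right: Optional["Node"] = None
    left: Optional["Node"] = None


def decode_tree(code, classify, decode_features, decode_right, decode_left,
                max_depth=16):
    """Recursively turn a latent code into a tree of nodes."""
    node = Node(attributes=decode_features(code))
    # The depth limit guarantees termination for an untrained classifier.
    label = classify(code) if max_depth > 0 else LEAF
    if label != LEAF:
        node.right = decode_tree(decode_right(code), classify,
                                 decode_features, decode_right, decode_left,
                                 max_depth - 1)
    if label == TWO_CHILDREN:
        node.left = decode_tree(decode_left(code), classify,
                                decode_features, decode_right, decode_left,
                                max_depth - 1)
    return node


def count_nodes(node):
    if node is None:
        return 0
    return 1 + count_nodes(node.right) + count_nodes(node.left)


def toy_classify(code):
    """Toy stand-in for the node classifier; the code is a single float."""
    if code > 4:
        return TWO_CHILDREN
    if code > 2:
        return ONE_CHILD
    return LEAF


tree = decode_tree(5.0, toy_classify,
                   decode_features=lambda c: (c, c, c, c / 10),
                   decode_right=lambda c: c - 1.0,
                   decode_left=lambda c: c - 2.0)
```

In a trained model the latent code would be a vector and each callable a small network, but the traversal order stays the same.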

The proposed loss function consists of 3 terms. The reconstruction term is minimized by training the features decoder MLP which reconstructs the input tree while the topology objective is minimized by correctly predicting the class of the nodes, represented by a 3-class cross entropy loss. Lastly the KL divergence term is used to shape the latent space and bring it closer to the standard normal distribution.
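Written out, a three-term loss of this kind could take the following form. The weights $\alpha$ and $\gamma$, the squared-error form of the reconstruction term and the notation ($a_n$ and $\hat{a}_n$ for the true and predicted attributes of node $n$, $y_{n,k}$ and $\hat{y}_{n,k}$ for the true and predicted class probabilities, $q(z \mid x)$ for the encoder distribution) are assumptions of this sketch; the exact formulation in [5] may differ.

```latex
\mathcal{L} = \mathcal{L}_{\text{recon}} + \alpha \, \mathcal{L}_{\text{topo}} + \gamma \, \mathcal{L}_{\text{KL}},
\qquad
\mathcal{L}_{\text{recon}} = \sum_{n} \lVert a_n - \hat{a}_n \rVert^2,
\qquad
\mathcal{L}_{\text{topo}} = - \sum_{n} \sum_{k=1}^{3} y_{n,k} \log \hat{y}_{n,k},
\qquad
\mathcal{L}_{\text{KL}} = D_{\text{KL}}\!\big( q(z \mid x) \,\Vert\, \mathcal{N}(0, I) \big)
```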

Figure 8. Results taken from [5]. (a) shows the histograms of total length, average radius and tortuosity per branch for both real and synthetic samples. (b) shows a visual comparison between their method and two baselines.

This last method presents another alternative to the model-based approaches and, similarly to the GAN, it overcomes their limitations (lack of control over the result and limited variability of the synthetic structures). Its main drawback is that training such algorithms requires medical data, which is limited. Furthermore, the computational cost of training and then using these models is extremely high when compared to the streamlined space-filling algorithms.

5. REVIEW

5.1. Comparison of Approaches

In Table 1 we can see a comparison of the previously mentioned methods. Control measures the ability of the user to control the result of the synthetic generation. Variability represents the ability of the model to capture the diversity of real data by synthesizing new cases each time. Realism is measured by the anatomical relevance of the results, that is, whether the resulting vessels represent plausible real geometries. Computing cost covers the time it takes to synthesize new samples as well as the costs associated with the training process. Finally, training data reflects whether the algorithm needs medical images in order to be trained.

Table 1. Comparison overview of the presented methods

*The results can be conditioned by the placement of the attraction nodes, like in the case of the center of the retina in [11].

**The generated coronary arteries do not have bifurcations and ramifications; it only models the main artery.

***The cost to create 3D images is high when the anatomical constraints are used.

In my opinion, the fractal L-system results reported in [8] lack critical evaluation. The authors claim that their method successfully synthesizes vasculature, yet they offer no validation to support this claim. It would have been interesting to see some sort of validation of the MIP images, for instance by pre-training a classification algorithm on them. This would align with the methodology employed by Menten et al. [11], who pre-trained several segmentation networks with synthetic images and presented quantitative results, showing the potential of their space-filling model in a real-world scenario. In my opinion, this paper was the most complete: their model generated anatomically relevant, realistic images, and they convincingly demonstrated its practical utility. Lastly, their representation of vasculature was converted into images with realistic artifacts, fulfilling all of the main objectives of these models.

In contrast, the data-driven approaches detailed in [4, 5] primarily focus on creating 3D geometry, neglecting the generation of realistic synthetic images. This raises questions about the suitability of their results for applications in deep learning or surgical training, where realistic synthetic data is crucial. Their way of evaluating the synthetic vessels is also flawed, as it relies merely on a statistical analysis. The methodology of the GAN model is particularly concerning: in this work, the researchers compare their results against their own training set.

5.2. Conclusion

This post gives a brief overview of synthetic vessel generation by model-based and data-driven methods. The presented algorithms are powerful tools that can help researchers in many different applications.

For tasks where the geometries serve a mainly visual purpose, such as educational videos or training simulators, the more streamlined model-based algorithms can be extremely useful. Fractal-based models lack anatomical realism in larger, more recognizable structures such as the coronary arteries, but can create great results in endothelial vasculature, where the real vessels are similar to fractal trees.

The mentioned space-filling algorithms achieve great results, with a certain degree of control and with extremely low computational requirements. These models can be useful for surgical training and 3D environment renderings but also as data augmentation algorithms.

Lastly, when the task at hand requires more variability and control over the synthesized geometries, data-driven models have the potential to take the upper hand. Still, there are several issues to be solved with these models. Although they do provide a way to control the results and capture the complexity of real data, the practical application of these results remains to be seen. They focus on the generation of a 3D geometry that has similar statistical attributes to real data, but there is no direct way of using these results to improve the performance of data-hungry networks or the training of medical staff.

6. REFERENCES

[1] British Heart Foundation. (n.d.). CVD statistics - global factsheet. Retrieved from https://www.bhf.org.uk/-/media/files/for-professionals/research/heart-statistics/bhf-cvd-statistics-global-factsheet.pdf?rev=f323972183254ca0a1043683a9707a01&hash=5AA21565EEE5D85691D37157B31E4AAA
[2] Lust, S. T., Shanahan, C. M., Shipley, R. J., Lamata, P., & Gentleman, E. (2021). Design considerations for engineering 3D models to study vascular pathologies in vitro. Acta Biomaterialia, 132, 114-128.
[3] Mosqueira-Rey, E., Hernández-Pereira, E., Bobes-Bascarán, J., Alonso-Ríos, D., Pérez-Sánchez, A., Fernández-Leal, Á., & Moret-Bonillo, V. (2023). Addressing the data bottleneck in medical deep learning models using a human-in-the-loop machine learning approach. Neural Computing & Applications. https://doi.org/10.1007/s00521-023-09197-2
[4] Wolterink, J. M., Leiner, T., & Isgum, I. (2018). Blood vessel geometry synthesis using generative adversarial networks. arXiv preprint arXiv:1804.04381.
[5] Feldman, P., Fainstein, M., Siless, V., Delrieux, C., & Iarussi, E. (2023, October). VesselVAE: Recursive variational autoencoders for 3D blood vessel synthesis. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 67-76). Springer Nature Switzerland.
[6] Prusinkiewicz, P., & Lindenmayer, A. (1990). The algorithmic beauty of plants. Springer-Verlag.
[7] Zamir, M. (2001). Arterial branching within the confines of fractal L-system formalism. The Journal of General Physiology, 118(3), 267-276.
[8] Galarreta-Valverde, M. A., Macedo, M. M. G., Mekkaoui, C., & Jackowski, M. P. (2013). Three-dimensional synthetic blood vessel generation using stochastic L-systems. In Medical Imaging 2013: Image Processing (Vol. 8669). SPIE.
[9] Rauch, N., & Harders, M. (2021). Interactive synthesis of 3D geometries of blood vessels. In Proceedings of Eurographics 2021.
[10] Talou, G. D. M., Safaei, S., Hunter, P. J., & Blanco, P. J. (2021). Adaptive constrained constructive optimisation for complex vascularisation processes. Scientific Reports, 11(1), 6180.
[11] Menten, M. J., Paetzold, J. C., Dima, A., Menze, B. H., Knier, B., & Rueckert, D. (2022, September). Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 330-340). Springer Nature Switzerland.
[12] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative adversarial networks. arXiv preprint arXiv:1406.2661.
[13] Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN. arXiv preprint arXiv:1701.07875.
[14] Hijazi, Y., Bechmann, D., Cazier, D., Kern, C., & Thery, S. (2010). Fully-automatic branching reconstruction algorithm: Application to vascular trees. In 2010 Shape Modeling International Conference (pp. 221-225). IEEE. https://doi.org/10.1109/SMI.2010.34
