Medical data currently doubles roughly every three years, making it impossible for doctors to analyze all the data belonging to a single patient. New approaches are therefore needed to assist doctors in diagnosing and treating patients. In modern society nearly everyone has access to the internet and is connected to everyone else. That is why many companies have started to exploit this data with machine learning approaches such as neural networks, developing systems that can access all the data and generate hypotheses from it. These systems move away from the traditional approach of a rigid decision tree toward cognitive systems. This chapter covers some approaches for medical software and the ethical questions that come with them.
Diagnosing through Data Processing
IBM Watson
Other Examples
A project by Google's parent company Alphabet is called Project Baseline [2][3]. Its aim is to find out how health is defined. The project is conducted in cooperation with Stanford University and Duke University. Over a period of four years they are recruiting around 10,000 people representing different ages, backgrounds, and medical histories. The participants visit a study site for a health test up to four times a year, wear the latest medical tracking devices, and take part in surveys about their health status and lifestyle. With this huge amount of data, the researchers hope to find out by which criteria a perfectly healthy human is defined.
Baidu, the Chinese equivalent of Google, produced a software system called Minwa, which, similar to the IBM Watson model, relies on natural language processing. It became famous after it beat Google and Microsoft in an image classification challenge. However, the success was short-lived: on June 1, 2015, Baidu admitted that it had broken rules governing the challenge.
HealthTap launched an app called Dr. A.I. to which users can describe their symptoms. The app then asks follow-up questions based on the answers and finally compiles a list of the most likely causes of the symptoms, ranked by seriousness. Furthermore, the app tells the user whether to see a doctor or not. However, the app is apparently not available on the European Google Play store [15].
The Cloud
Cloud computing has made data and the processing of it more ubiquitous, efficient and accessible. It offers access to data with geographical transparency and the protection against accidental deletion of data. Cloud-based systems are a key part of all of our personal lives, from Internet email services to wearables and app platforms. Whether in paper or in digital format, the privacy of health information in the US is protected by the Health Insurance Portability and Accountability Act. However, saving medical records and private health data in the cloud can be risky. For example, Eastern European hackers broke into Utah’s state health records database, gaining access to personal information on 780,000 patients including some 280,000 social security numbers [4]. But there are some misconceptions as well regarding cloud storage [5]:
- The provider owns patient data: many people worry that once their data is stored in the cloud, they can no longer control it. However, there is a standard agreement between cloud storage vendors and healthcare providers that obligates the vendor to return all data when the provider wishes to end the agreement.
- Data stored on-site is more secure than data in the cloud: keeping data close to home does not necessarily mean high security. In fact, the chances of an enterprise data center suffering a malware/bot attack are four times greater than those of a cloud-hosting provider. Furthermore, cloud data is stored with high-level encryption techniques and is therefore harder to read than just a sheet of paper.
The biggest ethical challenge is to preserve the right to privacy and confidentiality. Information about a patient should only be released to others if the patient agrees to it. If the patient is unable to decide, for whatever reason, the decision can be made by a legal representative. This information is considered confidential and must be protected. One approach is to allow only authorized individuals to access it. If the patient's identity cannot be determined from the information, the privacy constraint no longer applies. It is therefore good practice to strip medical information of all personal identifiers before the data reaches the cloud.
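The de-identification step described above can be sketched in a few lines of code. The field names below are purely illustrative, not a standardized identifier list:

```python
# Hypothetical set of direct identifiers to strip before data reaches the cloud.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "date_of_birth", "phone"}

def deidentify(record):
    """Return a copy of a patient record without direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

record = {"name": "Jane Doe", "ssn": "123-45-6789",
          "diagnosis": "GBM", "mgmt_methylated": True}
print(deidentify(record))  # only the medical fields remain
```

A real system would of course need a much more careful notion of what counts as identifying, since combinations of seemingly harmless fields can still re-identify a patient.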
Another danger is that data stored in the cloud can become inaccurate through improper use of options such as "cut and paste". The system processing the cloud data might then receive inaccurate input and predict a wrong diagnosis [6].
Cloud computing is on the rise and has proven to help doctors offer the best treatment for a patient. Though it comes with certain risks, multiple strategies are available to reduce them and to overcome barriers in the implementation of digital health records.
Ethical And Legal Issues
The current legal situation in the US is that doctors are still accountable for their decisions. The main reason is that machine learning diagnostic tools are regarded as an additional information source and are excluded from FDA oversight by the 21st Century Cures Act. However, the Clinical Decision Support Coalition has released some voluntary guidelines [16][17]:
- Transparency: CDS software should give clinicians the clinical evidence supporting its recommendation
- Competent end users: it should only be used by a trained expert in the field
- Time: the software should not be used to make split-second diagnoses
- Access to outside information: the doctor should consider other resources as well
One cannot deny that artificial intelligence is on the rise, be it with Tesla's self-driving cars or Amazon's Alexa. Currently, medical misdiagnosis is the third leading cause of death in the US, and machine diagnosis shows potential to drastically reduce this rate. With increasingly competent systems the question arises whether the responsibility might shift one day. If an algorithm has a higher accuracy rate than the average doctor, it seems wrong to place the blame on the physician. Furthermore, doctors who fear malpractice suits tend to order more tests than necessary; shifting responsibility away from them would therefore also reduce costs for insurers. Even now, studies show that defensive medicine is not always the best way to protect patients: a recent study found that patients whose doctors faced an increased risk of malpractice were 22 percent more likely to become, e.g., septic [18].
One could hold the system itself accountable, but it is hard to actually punish a piece of software or demand money from it, so this would leave affected patients unsatisfied. Another approach is to accuse the designers of the software, but since even its designers often cannot understand the decision making of an AI, this does not seem right either. The last option is to hold the organization behind the software responsible. This would provide a clear target, but it is unclear whether companies would stay in the market while facing such a high risk.
So in order to realize the benefits of AI in healthcare, the matter of responsibility still needs to be clearly defined. There will be no easy solution, but not finding one will undermine patient trust, place doctors in difficult positions, and potentially hinder further investment [19].
Current legal situation in the US
Currently the FDA, the US agency responsible for approving medical therapies and drugs for the US market, has no formal regulations for AI-assisted medical decision systems. So far each case has been treated individually, with software approvals based on guidelines such as the regulations for mobile medical applications, the closest comparable document at the moment. In the meantime, the 21st Century Cures Act excluded medical software from the FDA's jurisdiction, but the scope of this exclusion remains unclear. There is a draft guidance document named "Support Regulatory Decision-Making for Medical Devices", which gives an idea of how the FDA would analyse products falling into this category. For the future, the FDA is creating a new digital health unit with AI experts, software engineers and medical representatives, who will be responsible for analysing and creating regulatory guidelines for medical support systems based on AI and machine learning. [20]
Current legal situation in the EU/Germany
The EU regulation of medical devices states that "any instrument intended by the manufacturer to be used for human beings for the purpose of, among other things, diagnosis, prevention, monitoring, treatment or alleviation of disease" falls under the regulation. It does not exclude software, so assistance systems based on AI/machine learning that have a medical purpose can be regulated by these principles. Nevertheless, software has to be analysed case by case. [20]
However, the assessment of software systems is in no way comparable to the assessment of medical devices, as software does not directly affect a human's body functions.
The EU Medical Device Regulation (MDR) was recently updated, with changes in the classification of some medical products; it becomes law three years after publication. This update was pushed by members of the European Parliament who wished for regulation of robotics and artificial intelligence, especially regarding ethical standards, and for establishing liability for accidents involving these technologies. [21]
Arterys, an example for a FDA approved deep learning application
To help doctors diagnose heart problems, a company named Arterys built a medical imaging platform based on a self-learning artificial neural network. It helps doctors understand how the heart is working by providing accurate measurements of the heart's ventricles. The FDA approved the software platform after it passed tests guaranteeing that it is at least as accurate as humans. It clearly outperforms medical experts in terms of time: a doctor needs thirty minutes to an hour for a result, whereas the Arterys platform delivers a diagnosis within seconds. As the platform is a cloud-based deep learning application, the FDA approved it only after examining it against data security standards; Arterys uses strong data anonymization techniques in combination with secure transfer protocols and encryption. Overall, this is a big step towards further approvals of machine-learning-backed medical applications. [22]
Arterys' software platform analyses blood components
automatically using deep learning [source]
Machine Learning
The goal of Machine Learning is to find a model y(x,w) that maximizes the probability p(z_i | x_i, w) for a given measured feature x_i and a target output z_i. Machine Learning distinguishes between supervised and unsupervised learning. In supervised learning the input x and the target output z = f(x) are known, while the function f(x) itself is unknown; supervised learning is therefore sometimes also called learning from examples. Typical supervised learning tasks are object recognition and handwritten digit recognition. In unsupervised learning, by contrast, only the observations X, gathered by an unknown process, are known, so no ground truth is given. An example of an unsupervised learning task is source separation [7]. The datasets used for any kind of Machine Learning are most often split into a training and a test set (most commonly with ratio 2:1). To prevent overfitting (having a perfect model for the training data but very bad results on test data), the training set is often split again into a training and a validation set (again, the common ratio is 2:1) [8].
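The repeated 2:1 split can be sketched as follows; the helper function and sample data are illustrative only:

```python
import random

def split_2to1(data, seed=0):
    """Shuffle the data and split it 2:1 (e.g. training vs. test)."""
    data = data[:]                       # copy so the caller's list is untouched
    random.Random(seed).shuffle(data)    # deterministic shuffle for reproducibility
    cut = 2 * len(data) // 3             # two thirds go into the first part
    return data[:cut], data[cut:]

samples = list(range(90))
train_val, test = split_2to1(samples)   # 60 vs. 30 samples
train, val = split_2to1(train_val)      # 40 vs. 20 samples
print(len(train), len(val), len(test))  # 40 20 30
```

The validation set is then used to monitor overfitting during training, while the test set is touched only once for the final evaluation.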
Decision Trees
A decision tree is a simple form of Machine Learning. Every human has already used decision trees, usually without actively realizing it. The image shows a very simple decision tree that decides which animal (dog, cat, bird, fish) is present based on different features (fur, barking, flying) [8].
simple decision tree [8]
Of course, these trees can be much larger and more complex. The elements of a decision tree are [8]:
- nodes: the so-called feature tests ("Does it have fur?", "Does it bark?", "Does it fly?")
- branches: outcomes of the feature test (mostly yes and no, or true and false)
- leaves: a region in input space together with the distribution of samples in it; the class with the most samples is the decision (assuming the vector holds counts for the different animals (dog, cat, bird, fish), then (1,0,0,0) would represent a dog, (0,1,0,0) a cat, (0,0,1,0) a bird and (0,0,0,1) a fish)
In the example, the leaves are represented in a human-readable way.
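The animal tree above can be written directly as nested feature tests; the exact branching order is assumed from the description:

```python
def classify(has_fur, barks, flies):
    """Walk the decision tree; leaves are sample distributions
    over the classes (dog, cat, bird, fish)."""
    if has_fur:
        if barks:
            return (1, 0, 0, 0)   # dog
        return (0, 1, 0, 0)       # cat
    if flies:
        return (0, 0, 1, 0)       # bird
    return (0, 0, 0, 1)           # fish

animals = ("dog", "cat", "bird", "fish")
leaf = classify(has_fur=True, barks=False, flies=False)
print(animals[leaf.index(max(leaf))])  # cat
```

A learned tree has the same shape; the difference is that the feature tests and leaf distributions are chosen automatically from training data instead of by hand.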
Random Forests
Multiple decision trees can be trained on the same dataset by varying the split into training and test set. Such an ensemble is called a random forest. To get a decision from the forest, all trees decide on the given input, and the result picked most often becomes the result of the forest [8].
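The majority vote can be sketched as follows; the three toy "trees" are hand-written stand-ins for trained decision trees:

```python
from collections import Counter

def forest_predict(trees, x):
    """Each tree votes on the input; the most frequent class wins."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Toy stand-ins for trained trees that disagree on some inputs.
trees = [
    lambda x: "dog" if x["fur"] else "fish",
    lambda x: "dog" if x["barks"] else "cat",
    lambda x: "cat" if x["fur"] and not x["barks"] else "dog",
]
print(forest_predict(trees, {"fur": True, "barks": False}))  # cat
```

Because each tree sees a slightly different slice of the data, their errors tend to differ, and the vote averages those errors out.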
Support Vector Machines
Support Vector Machines (SVMs) transform the feature space into a higher dimension in which a separating hyperplane exists that maximizes the distance (margin) between the classes [14].
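A minimal illustration of why the higher-dimensional mapping helps (this is only the mapping idea, not a full SVM with margin maximization):

```python
# 1D points: class A at -2 and 2, class B at 0 — no single
# threshold on the line separates the two classes.
points = [(-2.0, "A"), (2.0, "A"), (0.0, "B")]

def lift(x):
    """Map the 1D feature into 2D: phi(x) = (x, x^2)."""
    return (x, x * x)

# In the lifted space, the horizontal line x2 = 2 separates the classes.
predictions = ["A" if lift(x)[1] > 2 else "B" for x, _ in points]
print(predictions)  # ['A', 'A', 'B']
```

Real SVMs avoid computing the mapping explicitly by using kernel functions, and additionally choose the separating hyperplane with the largest margin.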
Neural Networks
A neural network (NN) is a more complex form of Machine Learning. It uses non-linear functions to process the input data. A simple neural network with exactly one hidden layer is also called a multilayer perceptron [9].
neural network with one hidden layer [9]
The constant bias inputs (the "1" nodes) are necessary in every neural network but are mostly left out of diagrams. In the image, the input vector x is distributed with weights to the hidden nodes (in the depicted network, the number of hidden nodes equals the length of the input vector; each node applies a non-linear basis function such as a Gaussian or a sigmoid). The outputs of the hidden nodes are then combined, again with weights, into the network output vector y.
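The forward pass just described can be sketched in plain Python; the tanh non-linearity and the weight values are arbitrary example choices:

```python
import math

def forward(x, W1, b1, W2, b2):
    """One-hidden-layer forward pass: y = W2 * tanh(W1 * x + b1) + b2."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]

# Tiny network: 2 inputs, 2 hidden units, 1 output (arbitrary example weights).
W1 = [[0.5, -0.5], [1.0, 1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, -1.0]]
b2 = [0.1]
y = forward([1.0, 2.0], W1, b1, W2, b2)
print(y)
```

Training then means adjusting the weights W1, b1, W2, b2 (e.g. by gradient descent) so that the outputs match the targets on the training set.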
Deep Learning / Deep Neural Networks
If a neural network has more than one hidden layer as shown above, it is called a deep neural network. Theoretically it can have arbitrarily many layers, but one has to be aware of overfitting and computational limits (CPU/GPU, memory).
deep neural network with two hidden layers [9]
Diagnosis
Radiogenomics
Radiogenomics is the recognition of the genomic properties of a tumor from its appearance in images. It provides access to invaluable genetic information to predict tumor progression and response to drugs and other treatments, and uses a GPU-accelerated deep learning technique. This results in earlier and more accurate diagnosis and treatment of tumors; without it, half of the patients get potentially misleading results. The NN is trained with CUDA on a range of NVIDIA GPUs [12]. The developing team also won the second prize ($50,000) of NVIDIA's 2017 Global Impact Award [13].
MGMT Prediction
Glioblastoma multiforme (GBM) is the most common and deadliest type of brain tumor. It responds best to a mixture of radiation and chemotherapy [12,14]. Methylguanine methyltransferase (MGMT) is a key gene encoding a protein that repairs DNA [14]; the methylated version of MGMT suppresses DNA repair and thereby interferes with the tumor's DNA repair mechanism. The segmentation of the MRI images is done with a random forest; afterwards, SVMs and random forests are used to differentiate between MGMT-methylated and unmethylated GBMs. The results show that methylation status is important for predicting treatment response [14].
Bibliography
- [1] https://pharmaphorum.com/news/watson-for-genomics-brain-tumour/
- [2] https://www.projectbaseline.com/
- [3] https://www.googlewatchblog.de/2017/04/project-baseline-google-schwester/
- [4] https://thinkprogress.org/all-your-medical-data-in-the-cloud-not-so-fast-says-hhs-privacy-official-8cd5d20efa9a
- [5] https://www.socpub.com/articles/keeping-your-health-records-cloud-good-idea-15169
- [6] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4394583/
- [7] slides of the lecture Machine Learning I by Patrick van der Smagt (an introduction to Machine Learning), fall semester 2016/2017
- [8] slides of the lecture Machine Learning I by Wiebke Köpp (Decision Trees), fall semester 2016/2017
- [9] slides of the lecture Machine Learning I by Patrick van der Smagt (neural networks: introduction), fall semester 2016/2017
- [10] Orringer, Daniel A., et al. "Rapid intraoperative histology of unprocessed surgical specimens via fibre-laser-based stimulated Raman scattering microscopy." Nature Biomedical Engineering 1 (2017): 0027.
- [11] https://blogs.nvidia.com/blog/2017/02/17/ai-helps-improve-brain-tumor-diagnosis/ . Posted on February 17, 2017. Last accessed: July 16, 2017.
- [12] https://blogs.nvidia.com/blog/2017/04/17/ai-to-predict-brain-tumor-genomics/ . Posted on April 17, 2017. Last accessed: July 16, 2017.
- [13] https://blogs.nvidia.com/blog/2017/05/08/global-impact-awards/ . Posted on May 8, 2017.
- [14] Korfiatis, Panagiotis, et al. "MRI texture features as biomarkers to predict MGMT methylation status in glioblastomas." Medical Physics 43.6 (2016): 2835-2844.
- [15] http://fortune.com/2017/01/10/healthtap-dr-ai-launch/
- [16] http://cdscoalition.org/wp-content/uploads/2017/04/CDS-3060-Guidelines-032717-with-memo.pdf
- [17] http://www.fiercehealthcare.com/analytics/new-clinical-decision-support-software-guidelines-highlight-keys-to-self-regulation
- [18] http://blogs.harvard.edu/billofhealth/2017/01/26/artificial-intelligence-medical-malpractice-and-the-end-of-defensive-medicine/
- [19] https://qz.com/989137/when-a-robot-ai-doctor-misdiagnoses-you-whos-to-blame/
- [20] http://www.digitalhealthdownload.com/2017/02/whats-deal-watson-artificial-intelligence-systems-medical-software-regulation-u-s-eu/
- [21] http://www.europarl.europa.eu/news/en/press-room/20170210IPR61808/robots-and-artificial-intelligence-meps-call-for-eu-wide-liability-rules
- [22] https://www.forbes.com/sites/bernardmarr/2017/01/20/first-fda-approval-for-clinical-cloud-based-deep-learning-in-healthcare/#386c9048161c