Machine Learning Is The Future Of Cancer Prediction

Machine Learning models are getting better than pathologists at accurately predicting the development of cancer.

Sohail Sayed
Towards Data Science



Every year, pathologists diagnose 14 million new patients with cancer around the world. That’s millions of people who’ll face years of uncertainty.

Pathologists have been performing cancer diagnoses and prognoses for decades. Most pathologists have a 96–98% success rate for diagnosing cancer. They’re pretty good at that part.

The problem comes in the next part. According to the Oslo University Hospital, the accuracy of prognoses is only 60% for pathologists. A prognosis is the part of a biopsy that comes after cancer has been diagnosed: it predicts how the disease will develop.

It’s time for the next step to be taken in pathology.

Introducing Machine Learning

The next step in pathology is Machine Learning.

Machine Learning (ML) is one of the core branches of Artificial Intelligence. It’s a system which takes in data, finds patterns, trains itself using the data and outputs an outcome.

So what makes a machine better than a trained professional?

ML has key advantages over pathologists.

Firstly, machines can work much faster than humans. A biopsy usually takes a pathologist 10 days. A computer can analyze thousands of biopsies in a matter of seconds.

Machines can do something which humans aren’t that good at: they can repeat a task thousands of times without getting exhausted, and with every iteration they do it a little better. Humans do it too, we call it practice. While practice may make perfect, no amount of practice can put a human even close to the computational speed of a computer.

Another advantage is the great accuracy of machines. With the advent of Internet of Things technology, there is so much data out in the world that humans can’t possibly go through it all. That’s where machines help us. They can work through it faster than us, make accurate computations and find patterns in the data. That’s why they’re called computers.

Brief Technical Explanation of Machine Learning

To begin, there are two broad categories of Machine Learning:

  1. Supervised Learning
  2. Unsupervised Learning

Supervised Learning is Fed Labeled Data

Supervised learning is perhaps best described by its own name. A supervised learning algorithm is an algorithm which is “taught” by the data it is given.

The model trains itself using labeled data and then tests itself. This is repeated until the optimal result is achieved. Once this is done, it can make predictions on future instances.
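To make this concrete, here’s a tiny sketch of supervised learning in Python: a one-nearest-neighbour classifier that is “taught” by labeled examples. The tumor sizes and labels below are invented purely for illustration, not real clinical data.

```python
def predict(labeled_data, x):
    """Return the label of the labeled training point closest to x."""
    nearest = min(labeled_data, key=lambda point: abs(point[0] - x))
    return nearest[1]

# Labeled training data: (tumor size in cm, label) -- made-up values
training = [(0.5, "benign"), (0.8, "benign"),
            (3.1, "malignant"), (4.0, "malignant")]

print(predict(training, 0.6))   # near the benign examples
print(predict(training, 3.5))   # near the malignant examples
```

The “training” here is trivially just storing the labeled points, but it captures the core idea: the labels in the data supervise every future prediction.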

Unsupervised Learning Draws Conclusions from Unlabeled Data

In unsupervised learning data sets are not labeled. Instead, it’s the model’s job to create a structure that fits the data by finding patterns (such as groupings and clustering).

Think of unsupervised learning as a baby. Babies are born into this world without any knowledge of what’s “right” or “wrong” other than instincts. As they grow, they see, touch, hear and feel (input data) and try things out (test on the data) until they’ve learned about the world around them.
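A minimal sketch of the idea in Python: a plain k-means routine that groups unlabeled numbers into clusters entirely on its own. The values are made up, and real implementations (e.g. in scikit-learn) are far more robust.

```python
import random

def kmeans_1d(values, k, iters=10, seed=0):
    """Group unlabeled 1-D values into k clusters with plain k-means."""
    random.seed(seed)
    centers = random.sample(values, k)
    for _ in range(iters):
        # Assign each value to its nearest center
        clusters = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[i].append(v)
        # Move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 9.7, 10.1, 10.3]   # no labels -- two obvious groups
print(kmeans_1d(data, k=2))
```

No one told the algorithm there were two groups’ worth of structure in the data; it found the grouping itself, which is the essence of unsupervised learning.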

Alright, you know the two main categories of ML. Cool. Now let’s dive a bit deeper into some of the techniques ML uses.

Regression Makes the Outcome More Accurate

Regression’s main goal is to minimize the cost function of the model.

What’s cost function?

The cost function is a function which calculates the distance between the model’s hypothesis (its prediction) for an input x and the actual value. Basically, it shows you how far off the outcome is from the actual answer.
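A common choice of cost function is the mean squared error. Here’s a rough sketch in Python, with invented prediction and actual values:

```python
def cost(predictions, actuals):
    """Mean squared error: the average squared distance between the
    model's predictions and the actual values."""
    n = len(actuals)
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / n

# A hypothesis close to the real values has a low cost...
close = cost([2.0, 4.0, 6.0], [2.1, 3.9, 6.2])
# ...and one that is far off has a high cost.
far = cost([10.0, 0.0, 1.0], [2.1, 3.9, 6.2])
print(close, far)
```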

The whole point of regression is to find a hyperplane (fancy word for multi-dimensional line) that minimizes the cost function to create the best possible relationship between data points.

Linear regression making the relationship more accurate

It starts with a random line that has no correlation to the data, then iterates using gradient descent until it becomes the optimal fit.

Regression is done using an algorithm called Gradient Descent. In this algorithm, the cost function is reduced by the model adjusting its parameters.

Think of descent as you running down a hill, trying to get to the lowest point.

Meanwhile, as gradient descent drives the cost function lower and lower, the outcome becomes more and more accurate.

That’s how your model gets more accurate, by using regression to better fit the given data.
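Here’s what a bare-bones gradient descent loop might look like in Python, fitting a line y = w*x + b to made-up points. The learning rate and step count are arbitrary choices for this sketch:

```python
def gradient_descent(xs, ys, lr=0.01, steps=2000):
    """Fit y = w*x + b by repeatedly stepping w and b 'downhill'
    on the mean-squared-error cost."""
    w, b = 0.0, 0.0            # the initial, uncorrelated guess
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE cost with respect to w and b
        dw = sum(2 * ((w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * ((w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * dw           # step down the slope
        b -= lr * db
    return w, b

# Points that (noisily) follow y = 3x + 1
xs = [0, 1, 2, 3, 4]
ys = [1.1, 3.9, 7.2, 9.8, 13.1]
w, b = gradient_descent(xs, ys)
print(w, b)   # close to 3 and 1
```

Each iteration computes which direction makes the cost shrink fastest and takes a small step that way, exactly like running downhill toward the lowest point.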

Classification Categorizes Data Points Into Groups

Supervised learning models can do more than just regression. One of ML’s most useful tasks is classification.

Classification algorithms draw boundaries between data points, assigning each point to a group based on how its characteristics match the model’s parameters.

In this model, data points are classified as either sheep or goats, based on their steps per day and the average daily temperature.

The boundary between the classes is created using a process called logistic regression.

An important fact to remember is that the boundary does not have to be linear.

Remember the cost function? Surprise! It’s also used in classification.

In classification, it is used similarly to regression to find the best possible fit to the data.
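A sketch of the idea in Python: the sigmoid function turns a score like w*steps + b into a probability, and the 0.5 mark is the boundary between the two classes. The weights and the 5,000-steps-per-day boundary below are hand-picked for illustration, not learned from data:

```python
import math

def sigmoid(z):
    """Squashes any number into (0, 1), which we read as a probability."""
    return 1 / (1 + math.exp(-z))

def classify(steps, w, b):
    """Logistic-regression-style decision: probability above 0.5 puts
    the animal on the 'goat' side of the boundary w*steps + b = 0."""
    p = sigmoid(w * steps + b)
    return ("goat" if p > 0.5 else "sheep", p)

# Hand-picked (not learned) parameters: boundary at 5,000 steps/day
w, b = 0.002, -10.0
print(classify(3000, w, b))   # well below the boundary
print(classify(8000, w, b))   # well above the boundary
```

In real logistic regression, gradient descent on the cost function would find w and b instead of us picking them by hand.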

Support Vector Machines Find the Widest Margin

SVMs are supervised learning algorithms used for both classification and regression.

The goal of an SVM algorithm is to classify data by finding the boundary with the widest possible margin between itself and the nearest data points of each class.
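One way to see the “widest margin” goal is to measure the margin of a candidate boundary: the distance from the closest point to the line. A quick Python sketch with made-up points; a real SVM would search for the best boundary, whereas here we just compare two by hand:

```python
import math

def margin(points, w, b):
    """Distance from the closest point to the line w[0]*x + w[1]*y + b = 0.
    An SVM looks for the (w, b) that makes this as wide as possible."""
    norm = math.hypot(w[0], w[1])
    return min(abs(w[0] * x + w[1] * y + b) for x, y in points) / norm

# Two toy clusters and two candidate boundaries (all values made up)
points = [(1, 1), (1, 2), (5, 5), (6, 5)]
narrow = margin(points, (1, 1), -4.0)   # line x + y = 4 hugs one cluster
wide = margin(points, (1, 1), -6.5)     # line x + y = 6.5 sits between them
print(narrow, wide)                     # the centred line has the larger margin
```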

Decision Trees Narrow Down to an Outcome

A Decision Tree (DT) is a tree-like model (if trees grew upside down) that represents probabilities and decision making in ML.

The process of deciding what you’ll be eating

As seen in the figure above, DTs use conditional statements to narrow down the probability that an instance has a certain outcome.

DTs keep splitting into further nodes until every input has an outcome.

Basically, internal nodes split further, while external nodes are like a stop sign.
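The figure’s food decision could be sketched as nested conditionals in Python; the questions and outcomes below are invented for illustration:

```python
def what_to_eat(hungry, have_groceries, money):
    """A decision tree written as conditional statements: internal
    nodes split further, leaves (the 'stop signs') give the outcome."""
    if not hungry:                 # internal node
        return "nothing"           # leaf
    if have_groceries:             # internal node
        return "cook at home"      # leaf
    if money > 15:                 # internal node
        return "restaurant"        # leaf
    return "instant noodles"       # leaf

print(what_to_eat(hungry=True, have_groceries=False, money=20))
```

In a learned decision tree, an algorithm would pick these splitting questions automatically from the data instead of us writing them by hand.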

Bayesian Networks Estimate Probability

A Bayesian Network (BN) is a classifier similar to a decision tree. The difference is that BN classifiers output probability estimates rather than hard predictions.

The set of variables and their conditional dependencies is shown in a visual form called a directed acyclic graph.

In the example above, the two reasons for grass being wet are either rain or the sprinkler. Using a BN model, the probability of each possible scenario can be found.
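Here’s a rough Python sketch of that example with illustrative probabilities (the numbers are made up for this sketch, not from a real study). Enumerating every scenario gives the probability that it rained, given that the grass is wet:

```python
def p_wet(sprinkler, rain):
    """P(grass wet | sprinkler, rain) -- illustrative numbers."""
    table = {(True, True): 0.99, (True, False): 0.9,
             (False, True): 0.8, (False, False): 0.0}
    return table[(sprinkler, rain)]

P_RAIN = 0.2                      # prior probability of rain

def p_sprinkler(rain):
    """The sprinkler rarely runs when it rains."""
    return 0.01 if rain else 0.4

# Enumerate every scenario to get P(rain | grass is wet)
num = den = 0.0
for rain in (True, False):
    for sprinkler in (True, False):
        p = ((P_RAIN if rain else 1 - P_RAIN)
             * (p_sprinkler(rain) if sprinkler else 1 - p_sprinkler(rain))
             * p_wet(sprinkler, rain))
        den += p
        if rain:
            num += p
print(num / den)   # probability it rained, given the grass is wet
```

This is exactly what makes BNs useful: instead of one hard prediction, you get a calibrated probability for each scenario.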

Artificial Neural Networks Learn from the Data

Artificial Neural Networks (ANNs) learn from the data they’re given. They take inspiration from our own neural systems, though they don’t quite work the same way.

ANN models are fed a lot of data in a layer we call the input layer. From this data, comparisons are made and the model automatically identifies characteristics of the data and labels it.

They’re kind of similar I guess?

There are three layer types in ANNs.

  • Input Layer
  • Hidden Layers
  • Output Layer

This is how an ANN works — first, every neuron in the input layer is given a value, called an activation. Each connection between neurons is assigned a random weight, and every hidden-layer neuron is assigned a random bias value. A hidden-layer neuron multiplies each incoming activation by the weight of its connection, sums them up, adds its bias, and passes the result through an algorithm called the activation function to produce its own activation. The same process repeats between the hidden layer and the output layer to produce the network’s output.

The first model with random bias and weights. The network is essentially guessing at this point.

These weights and biases start out random, but get better with every iteration through a process called backpropagation.

At first, the model’s prediction for a given instance is essentially random. Using backpropagation, the ANN compares its prediction with the correct answer and adjusts its weights and biases to make the next prediction more accurate.
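To make this concrete, here’s a minimal sketch in Python: a single sigmoid neuron whose weight and bias start at arbitrary values and are nudged by backpropagation-style gradient steps. The input, target and learning rate are invented for the sketch:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# One sigmoid neuron: prediction = sigmoid(w*x + b).
# The starting parameters are arbitrary, so the first guess is poor.
w, b = 0.5, 0.0
x, target = 1.0, 1.0
lr = 1.0

for step in range(50):
    pred = sigmoid(w * x + b)     # forward pass
    error = pred - target         # how wrong the guess is
    # Backpropagation: chain rule through the squared error and sigmoid
    grad = 2 * error * pred * (1 - pred)
    w -= lr * grad * x            # adjust the weight
    b -= lr * grad                # adjust the bias

print(sigmoid(w * x + b))   # far closer to the target than the first guess
```

A real ANN does this across thousands of neurons and many layers at once, but every layer’s update is this same recipe: compute the error, push its gradient backward, and nudge the weights.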

Machines think this cat is pretty adorable too.

For example, if a model was to classify cats from a large database of images, it would learn by recognizing edges that make up features like eyes and tails and eventually scale up to recognizing whole cats. Think of this process like building Lego. You identify different parts, put different sections together and finally put all the different sections together to make your masterpiece.

Back To Machine Learning Cancer Prognoses

Ok, so now you know a fair bit about machine learning.

Now, to the good part. You’ll now be learning about some of the models that have been developed for cancer biopsies and prognoses.

The model that predicts cancer susceptibility

This first model that I’ll show you was built to discriminate tumors as either malignant or benign among breast cancer patients.

In this model, ANNs were used to complete the task. This model was built with a large number of hidden layers to better generalize the data. Thousands of mammographic records were fed into the model so that it could learn to distinguish between benign and malignant tumors. Before being input, all the data was reviewed by radiologists.

An example of what a cancer prediction neural network’s inputs could be.

The model was largely successful, with an AUC of 0.965 (AUC, or area under the curve, is a measure of how well a model separates the classes). Though this model is accurate, its main advantage over pathologists is that it is more consistent, more effective and less prone to error.

The model that predicts cancer recurrence

Alright, predicting cancer is neat. But predicting the recurrence of cancer is a way more complex task for humans. Luckily, machines are getting good at it. Let me explain how.

This model used a variety of ML techniques to learn how to predict the recurrence of oral cancer after the total remission of cancer patients. Clinical, imaging and genomic sources of data were collected from 86 patients for this model. Feature selection algorithms reduced the model’s features from above 110 to less than 30. This made the model more efficient and greatly reduced bias. The model was tested using BNs, ANNs, SVMs, DTs and random forests (RFs) to classify patient data into those with cancer relapses and those without.

This BNN model predicts the recurrence of breast cancer.

In the end, the model correctly classified every patient using feature-selected data and BNs. Even though this was a really accurate model, it had a really small dataset of only 86 patients.

In another similar study, researchers made an ML model that tested using SVMs, ANNs and regression to classify patients into low-risk and high-risk groups for cancer recurrence. The SVM model outperformed the other two and had an accuracy rate of 84%. This was groundbreaking, as it was significantly more accurate than pathologists.

The model that predicts cancer survival rates

This model took in a dataset of 162,500 records and 16 key features. Using features such as the size of the tumor and the age of the patient, the model classified whether or not a patient would survive. The model was tested using SVMs, ANNs and semi-supervised learning (SSL, a mix between supervised and unsupervised learning). SSL proved the most successful, with an accuracy rate of 71%.

Another study used ANN’s to predict the survival rate of patients suffering from lung cancer. It had an accuracy rate of 83%. This study is considered largely accurate, though it did not take into account other death-related factors such as blood clots.

What Does the Future of Cancer Prognosis Look Like?

AI is set to change the medical industry in the coming decades — it wouldn’t make sense for pathology to not be disrupted too.

Currently, ML models are still in the testing and experimentation phase for cancer prognoses. As datasets are getting larger and of higher quality, researchers are building increasingly accurate models.

Here’s what a future cancer biopsy might look like:
You perform clinical tests, either at a clinic or at home. Data is inputted into a pathological ML system. A few minutes later, you receive an email with a detailed report that has an accurate prediction about the development of your cancer.

While you might not see AI doing the job of a pathologist today, you can expect ML to replace your local pathologist in the coming decades, and it’s pretty exciting!

ML models still have a long way to go; most models still lack sufficient data and suffer from bias. Yet something we are certain of is that ML is the next step for pathology, and it will disrupt the industry.

“There certainly will be job disruption. Because what’s going to happen is robots will be able to do everything better than us. … I mean all of us,” — Elon Musk

Key takeaways

  • Machine Learning is a branch of AI that uses numerous techniques to complete tasks, improving itself after every iteration.
  • Pathologists are accurate at diagnosing cancer but have an accuracy rate of only 60% when predicting the development of cancer.
  • Machine Learning is the next step forward for us to overcome this hurdle and create a high accuracy pathology system.

Thanks for reading! If you enjoyed this article:

  • Make sure to show support by sharing
  • Stay updated with me through Linkedin
  • Follow me on Medium for more articles like this!
