Machine Learning & AI Applications in Oncology

Photo by National Cancer Institute on Unsplash

Inside AI

An Introduction to Machine Learning in Oncology

On the Future of AI in Cancer Treatment and Research

Recent advancements in oncology have led to exciting options for cancer treatment and long-term remission. However, efficient and effective diagnosis remains an obstacle to timely treatment of most cancers. Favorable prognoses depend on early detection, but in many cases patients are unaware of their cancers until they suffer from related conditions. Pathologists need more powerful tools to help them identify possible cancers quickly and correctly.


The Potential and Challenges of AI Models

Artificial intelligence (AI) has the potential to revolutionize data collection, image processing, and subsequent diagnosis. Machine learning (ML) and deep learning are branches of AI that focus on identifying patterns in data and building models that accurately predict outcomes for new inputs. The medical field presents unique challenges for ML models. First, much of the data, especially from electronic health records (EHR), is uncategorized, sparse, or unavailable because of hospitals’ outdated systems or unwillingness to share data. Second, the consequences of a poorly performing model are far-reaching and could be catastrophic for patients. Finally, the extreme complexity of cancer and its possible treatments tends to produce highly specific, isolated ML models. For example, a model that used the profile of circulating miRNAs to detect cancer suffered from low sensitivity and could not differentiate between benign and malignant tumors [2]. This suggests that models and insights could be difficult to share both across organizations and across cancer types.
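To make "low sensitivity" concrete, here is a minimal sketch of how sensitivity, specificity, and accuracy are computed from binary predictions. The labels below are made up for illustration and do not come from any study cited here; the function names are likewise hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = malignant, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(y_true, y_pred):
    """Fraction of truly malignant cases the model flags (true positive rate)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if tp + fn else 0.0


def specificity(y_true, y_pred):
    """Fraction of truly benign cases the model clears (true negative rate)."""
    _, tn, fp, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp) if tn + fp else 0.0


def accuracy(y_true, y_pred):
    """Fraction of all cases classified correctly."""
    tp, tn, _, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true) if y_true else 0.0


# Made-up example: 4 malignant and 6 benign samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 1]

print("accuracy:   ", accuracy(y_true, y_pred))
print("sensitivity:", sensitivity(y_true, y_pred))
print("specificity:", round(specificity(y_true, y_pred), 3))
```

In this toy case the model is right on 7 of 10 samples yet misses half of the malignant ones, which is why accuracy alone can be a misleading measure for screening models.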

Machine learning models generally learn from data in one of three styles. Supervised learning is the most straightforward. Labeled inputs are fed to the model, which adjusts its internal parameters (in a neural network, the weights of its hidden layers) to learn a mapping from inputs to labels. Because the inputs are pre-labeled, the model’s accuracy can be measured and the model refined. Once trained, the model can make predictions on new data. Unsupervised learning has no user-defined labels, so the model must discover structure in the inputs on its own. This requires less human intervention, which is both a strength and a weakness: the model may find patterns that are hard for humans to recognize, or it may rely on arbitrary features that have little real predictive power. Semi-supervised learning uses a mix of labeled and unlabeled data. This is the closest to how human pathologists reach a diagnosis, as they have access to labeled data, like tumor width or density, as well as unlabeled data, like previous records. Feature selection and extraction is a critical, labor-intensive part of developing successful models and will be discussed in a later post.
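The three styles can be contrasted with a small sketch. It assumes a single hypothetical feature (tumor width in mm) with made-up values, and it uses a nearest-centroid rule in place of a neural network to keep the example short. The semi-supervised step shown is self-training with pseudo-labels, which is only one of several semi-supervised approaches.

```python
from statistics import mean


def fit_centroids(values, labels):
    """Supervised: compute one centroid (mean feature value) per known label."""
    classes = sorted(set(labels))
    return {c: mean(v for v, l in zip(values, labels) if l == c) for c in classes}


def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda c: abs(centroids[c] - value))


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2)."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not near_lo or not near_hi:
            break
        lo, hi = mean(near_lo), mean(near_hi)
    return lo, hi


def self_train(values, labels, unlabeled):
    """Semi-supervised: pseudo-label the unlabeled points, then refit on everything."""
    centroids = fit_centroids(values, labels)
    pseudo_labels = [predict(centroids, v) for v in unlabeled]
    return fit_centroids(values + unlabeled, labels + pseudo_labels)


# Made-up data: hypothetical tumor widths in mm.
labeled_widths = [4.0, 6.0, 20.0, 24.0]
labels = ["benign", "benign", "malignant", "malignant"]
unlabeled_widths = [5.0, 7.0, 9.0, 18.0, 22.0, 26.0]

supervised = fit_centroids(labeled_widths, labels)
clusters = two_means(unlabeled_widths)
semi_supervised = self_train(labeled_widths, labels, unlabeled_widths)

print("supervised centroids:     ", supervised)
print("unsupervised cluster means:", clusters)
print("semi-supervised centroids:", semi_supervised)
```

Note that the unsupervised step finds two groups but cannot say which one is malignant; attaching meaning to the clusters still takes labels or human judgment.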


From [5]: A schematic CNN model used to predict cancer survival from histological data. Licensed under CC BY-NC-ND 4.0.

AI Applications in Biomedical Fields

The explosion of multi-omic data afforded by advances in next-generation sequencing (NGS) and high-throughput technologies (HTTs) has made genomic and proteomic data more available to researchers and clinicians [2]. For example, molecular biomarkers have proven very effective for predicting cancer because certain markers, like PD-L1 and CTLA-4, are overexpressed in cancerous cells. These data can also help clinicians differentiate between subgroups of the same cancer and enable individualized treatment. If a CNN model could be trained to identify these markers from stained histological samples, it could lead to regular, cheap, and effective screenings without intensive involvement from a clinician.

Clinical data, such as notes, medical imaging, and live physiological monitoring, lend themselves well to analysis and prediction by ML models. This type of data needs heavy feature engineering because patient data is highly heterogeneous and not easily applied to new contexts. As Wang and Preininger suggest, deep learning (DL) models may offer a solution by learning their own features, which could allow more robust and accurate predictions and more individualized treatment [3]. For example, a framework developed by De Fauw et al. for retinal disease uses two models in sequence: the first DL model analyzes the input and extracts features, which are then fed to a second convolutional neural network (CNN) that predicts likely diagnoses [4].
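The two-stage idea, where one model produces an intermediate representation and a second model predicts from it, can be sketched as a simple function composition. This is a toy illustration only: both stages are threshold rules standing in for trained networks, and the names, thresholds, and pixel values are assumptions, not details of the framework in [4].

```python
from collections import Counter


def stage_one(pixels, threshold=0.5):
    """Stand-in for the first model: map raw intensities to coarse tissue labels."""
    return ["dense" if p >= threshold else "normal" for p in pixels]


def stage_two(tissue_map, dense_fraction_cutoff=0.3):
    """Stand-in for the second model: predict from the intermediate map only."""
    counts = Counter(tissue_map)
    dense_fraction = counts["dense"] / len(tissue_map)
    label = "refer" if dense_fraction >= dense_fraction_cutoff else "routine"
    return label, dense_fraction


def pipeline(pixels):
    """Chain the stages: raw input -> intermediate features -> prediction."""
    return stage_two(stage_one(pixels))


# Made-up "scan": ten normalized pixel intensities.
scan = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7, 0.1, 0.2, 0.6, 0.4]
label, dense_fraction = pipeline(scan)
print(label, dense_fraction)
```

One appeal of this separation is that the second stage never sees raw input, so the first stage can be retrained for a new scanner or data source without touching the diagnostic model.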

Patient outcome modeling has also had some success in estimating survivability and cancer remission. One artificial neural network model reached 83% accuracy in predicting survival in non-small cell lung cancer (NSCLC). However, as with many of these models, it is not applicable to other cancer types, because several NSCLC-specific genes and their expression levels were used as features. General predictive models may be difficult to develop because of the sheer variety of cancer types and because genes that are useful for predicting one cancer may not be useful for another.


Key Takeaways and Next Steps

The future of AI in medical research and treatment is bright. Although current models are highly specific, their successes in these areas should not be overlooked. As more data becomes available to researchers, and as clinicians’ familiarity with and understanding of the possible applications of AI grow, we should expect a dramatic increase in the use of these models.

That said, we must be wary of a sensationalized future for these models. While they are powerful, they are also limited in scope and should be viewed as another tool that helps clinicians provide more effective, individualized treatment to patients. AI models are adept at identifying abnormalities in an otherwise homogeneous sample, and this ability should be leveraged preventatively to guide pathologists to potential issues.


Sources

[1] M.S. Copur, State of Cancer Research Around the Globe (2019), Oncology 33 (5): 181–185.

[2] K. Kourou, T.P. Exarchos, K.P. Exarchos, et al., Machine learning applications in cancer prognosis and prediction (2015), Computational and Structural Biotechnology Journal 13: 8–17.

[3] F. Wang and A. Preininger, AI in Health: State of the Art, Challenges, and Future Directions (2019), International Medical Informatics Association: Yearbook of Medical Informatics, 16–26.

[4] J. De Fauw, J.R. Ledsam, B. Romera-Paredes, et al., Clinically applicable deep learning for diagnosis and referral in retinal disease (2018), Nature Medicine 24 (9): 1342–1350.

[5] P. Mobadersany, S. Yousefi, M. Amgad, et al., Predicting cancer outcomes from histology and genomics using convolutional networks (2018), PNAS 115 (13): E2970–E2979.
