PODCAST

AI, single cell genomics, and the new era of computational biology

Tali Raveh on the promise of AI-powered single-cell immunology

Jeremie Harris
Towards Data Science
3 min read · Feb 2, 2022

Editor’s note: The TDS Podcast is hosted by Jeremie Harris, who is the co-founder of Mercurius, an AI safety startup. Every week, Jeremie chats with researchers and business leaders at the forefront of the field to unpack the most pressing questions around data science, machine learning, and AI.

Until very recently, the study of human disease involved looking at big things — like organs or macroscopic systems — and figuring out when and how they can stop working properly. But that’s all started to change: in recent decades, new techniques have allowed us to look at disease in a much more detailed way, by examining the behaviour and characteristics of single cells.

One class of those techniques is now known as single-cell genomics: the study of gene expression and function at the level of single cells. Single-cell genomics is creating new, high-dimensional datasets consisting of tens of millions of cells whose gene expression profiles and other characteristics have been painstakingly measured. And these datasets are opening up exciting new opportunities for AI-powered drug discovery, opportunities that startups are now starting to tackle head-on.

Joining me for today’s episode is Tali Raveh, Senior Director of Computational Biology at Immunai, a startup that’s using single-cell data to perform high-resolution profiling of the immune system at industrial scale. On this episode of the TDS podcast, Tali joined me to talk about what makes the immune system such an exciting frontier for modern medicine, and how single-cell data and AI might be poised to generate unprecedented breakthroughs in disease treatment.

Here are some of my favourite take-homes from the conversation:

  • While organs and other macroscopic structures matter a lot, the story of the immune system is largely the story of more than 50 different cell types that interact with the rest of the body, and with each other, in complex ways. Many diseases that present with similar symptoms may respond to different treatments if they stem from problems with different immune cell types. As a result, effective treatment of certain diseases requires a cell-level understanding of their underlying causes.
  • A confluence of advances in areas as diverse as microfluidics, optics and AI has made single-cell data accessible. The upshot is that we can now measure the expression levels of over 20,000 genes, the membrane protein compositions, and various other characteristics of individual cells. That leads to sparse, high-dimensional datasets that are very amenable to ML analysis. (The sparsity arises because most cells don’t express the majority of the 20,000 genes in the genome.)
  • One particularly promising use case for these datasets is the study of drug interactions. Most of the data we have today about drug interactions is dubious: there are so many commonly prescribed drugs that studying interactions between more than two drugs at a time quickly leads to a combinatorial explosion of possibilities. The space of drug interactions is vast, which makes it particularly amenable to ML techniques that leverage compressed latent representations to make predictions.
  • Immunai is also working on predicting how different cell types will respond to treatments or drugs. To do so, they expose single cells to the drug or perturbation they’re interested in, and measure how the cells respond. But that process is expensive, so in practice the vast majority of the data they have isn’t labeled in this way. For that reason, they make heavy use of semi-supervised learning techniques, which can learn from both labeled and unlabeled data.
  • Tali cites transformer networks as one AI-related breakthrough that’s proven to be a game-changer when it comes to single-cell data analysis. More generally, the sparsity of labels and the volume of data combine to make auto-regressive techniques particularly interesting in this domain. Over the next three years, Tali is optimistic that techniques like transformers will lead to important breakthroughs both in understanding immune behaviour and in discovering new, more effective drugs.
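
To put rough numbers on the combinatorial explosion mentioned in the drug-interaction point above, here is a minimal Python sketch. The function name `count_drug_combinations` and the figure of 1,000 drugs are illustrative assumptions, not numbers from the episode; the only thing the sketch relies on is the standard count of unordered combinations, "n choose k".

```python
from math import comb


def count_drug_combinations(n_drugs: int, max_size: int) -> dict[int, int]:
    """Count the distinct unordered drug combinations of each size.

    Returns a dict mapping combination size k (from 2 up to max_size)
    to the number of ways to choose k drugs out of n_drugs.
    """
    return {k: comb(n_drugs, k) for k in range(2, max_size + 1)}


if __name__ == "__main__":
    # 1,000 is a hypothetical number of commonly prescribed drugs,
    # chosen only to illustrate how quickly the counts grow.
    for size, n_combos in count_drug_combinations(1000, 4).items():
        print(f"{size}-drug combinations: {n_combos:,}")
```

Under that assumed figure, pairs number in the hundreds of thousands, triples in the hundreds of millions, and four-drug combinations in the tens of billions, which is why testing every combination directly is impractical.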

Chapters:

  • 0:00 Intro
  • 2:00 Tali’s background
  • 4:00 Immune systems and modern medicine
  • 14:40 Data collection technology
  • 19:00 Exposing cells to different drugs
  • 24:00 Labeled and unlabeled data
  • 27:30 Dataset status
  • 31:30 Recent algorithmic advances
  • 36:00 Cancer and immunology
  • 40:00 The next few years
  • 41:30 Wrap-up

Co-founder of Gladstone AI 🤖 an AI safety company. Author of Quantum Mechanics Made Me Do It (preorder: shorturl.at/jtMN0).