
AI in Action: Guiding the Discovery of New Antibiotics to Target Multidrug-Resistant Bacteria

Learn about the data science of AI applications to chemistry through my lay description of a new paper

Applications of AI to problems that matter

Discovering new antibiotics to combat drug-resistant bacteria is an urgent need as bacteria become resistant to existing ones, but it is an extremely challenging and costly endeavour. Any way to come up with new antibiotics faster and more efficiently is thus very welcome.

A recent study published in Nature Chemical Biology demonstrated the power of machine learning (ML) in accelerating antibiotic discovery. The researchers used advanced algorithms to screen thousands of molecules and identified a promising compound called abaucin that specifically targets Acinetobacter baumannii, a pathogen that is now resistant to many of the antibiotics usually employed in hospitals to treat infections, which makes it what's called a "multidrug-resistant pathogen". This advance in ML applied to antibiotic research highlights the potential of artificial intelligence to transform the field, promising a future of faster, more reliable, and less expensive antibiotic development.

The biology and what this model discovered

Conventional screening approaches for discovering new antibiotics have been limited in their effectiveness against A. baumannii due to its multidrug resistance. As I’ve touched upon a few times, finding new drugs is a pressing matter:

New antibiotics are hard and expensive to develop, so scientists are trying these alternatives

We may finally have a new class of antibiotics

When microorganisms fight each other, we have a chance to discover novel antibiotics -here’s a new…

In the last couple of years, ML techniques have started to provide novel and more efficient ways to explore chemical space via message-passing networks, transformers, and diffusion models. Naturally, these new approaches can increase the chances of finding potent antibacterial molecules. In the study I present here, just published in Nature Chemical Biology, the researchers screened around 7,500 molecules to identify those that inhibited the growth of A. baumannii in laboratory tests, and then built a predictive ML model with which they came up with the new prospective antibiotic, abaucin:

Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii – Nature Chemical…

Abaucin not only demonstrates promising characteristics as a potential antibiotic, but it also exhibits selective activity against A. baumannii, making it a narrow-spectrum antibiotic with minimal effects on other bacterial species. This specificity is crucial for minimizing disruption to the body's natural microbiota (the bacteria that normally live on our skin, in our gut, etc. and are essential to our well-being), which plays a vital role in human health. Even more promising regarding its actual use as an antibiotic, the study reports that abaucin effectively controls A. baumannii infections in a mouse wound model, indicating its therapeutic potential.

Data science and ML behind the discovery

At the core of the work, a message-passing neural network was trained using a dataset of molecules labeled as capable (or not) of inhibiting the growth of A. baumannii. This dataset was itself measured and reported in the same work. Subsequently, the trained model was used to make predictions on the Drug Repurposing Hub, a comprehensive, annotated resource of FDA-approved drugs, clinical-trial candidates, and preclinical compounds, whose purpose is to support studies in which existing drugs are repurposed for new treatments. Here the focus was on identifying structurally new molecules with activity against A. baumannii. This process led to the discovery of abaucin as explained above.

The ML model used a graph representation of the molecules and iteratively exchanged information about the chemical environments around atoms through message-passing steps. The learned features were combined with fixed molecular features computed using RDKit. The training data came from the growth-inhibition screen against A. baumannii carried out in the same work.

Key in the model is how it converts the graph representing the structure of each molecule into a continuous vector representation by iteratively exchanging local chemical information between adjacent atoms and bonds in a series of message-passing steps. By accumulating the vector representations of various local chemical regions, the model obtained a comprehensive representation of the entire compound. To augment the learned features, fixed molecular features were computed using RDKit, one of the most important libraries out there for cheminformatics. The final vector, incorporating both learned and computed features, was then used as input for a feed-forward neural network trained to predict the antibacterial properties of the molecule, as a classifier.
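To make the message-passing idea more concrete, here is a toy sketch in plain Python. It is not the authors' code: the real network applies learned weights and nonlinearities at each step, while here every atom simply adds up its neighbors' vectors. The molecule (the three heavy atoms of ethanol), the two-number atom features, and the single "fixed descriptor" standing in for the RDKit features are all assumptions I made up for illustration.

```python
# Toy sketch of message passing on a molecular graph (not the authors' code).
# Ethanol heavy atoms, C-C-O; every feature value below is a made-up assumption.
atom_features = {
    0: [1.0, 0.0],  # carbon, encoded as [is_carbon, is_oxygen]
    1: [1.0, 0.0],  # carbon
    2: [0.0, 1.0],  # oxygen
}
bonds = [(0, 1), (1, 2)]


def neighbors(atom, bonds):
    """Return the atoms bonded to `atom`."""
    return [b if a == atom else a for a, b in bonds if atom in (a, b)]


def message_passing_step(hidden, bonds):
    """Each atom adds the sum of its neighbors' vectors to its own vector."""
    updated = {}
    for atom, vec in hidden.items():
        message = [0.0] * len(vec)
        for nb in neighbors(atom, bonds):
            message = [m + h for m, h in zip(message, hidden[nb])]
        updated[atom] = [v + m for v, m in zip(vec, message)]
    return updated


def readout(hidden):
    """Sum all atom vectors into one molecule-level vector."""
    size = len(next(iter(hidden.values())))
    return [sum(vec[i] for vec in hidden.values()) for i in range(size)]


hidden = atom_features
for _ in range(3):  # three steps, the number reported in the study
    hidden = message_passing_step(hidden, bonds)

learned = readout(hidden)
fixed_descriptors = [3.0]  # placeholder for RDKit-computed features (here: heavy-atom count)
molecule_vector = learned + fixed_descriptors  # input to the feed-forward classifier
print(molecule_vector)  # [29.0, 12.0, 3.0]
```

After three steps each atom's vector carries information from atoms up to three bonds away, which is why the summed vector can describe the molecule as a whole rather than isolated atoms.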

The dataset used for training consisted of the results from a screen of 7,684 small molecules, evaluating their impact on the growth of A. baumannii. The screening experiments resulted in 480 molecules classified as 'active' and 7,204 molecules classified as 'inactive.' This dataset was used to train the above-described network as a binary classifier that predicted the activity of structurally new molecules. Additionally, the Drug Repurposing Hub, containing 6,680 molecules, was employed for further predictions using an ensemble of ten classifiers.
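Two things in these numbers are worth spelling out: the classes are heavily imbalanced, and an ensemble needs some rule to merge its members' outputs. The sketch below computes the active fraction from the reported counts and then averages ten scores for one molecule. Averaging is my assumption of how the ensemble was combined, and the ten scores are invented for the example.

```python
from statistics import mean

n_active, n_inactive = 480, 7204      # counts reported in the study
n_total = n_active + n_inactive       # 7,684 screened molecules
active_fraction = n_active / n_total
print(f"{n_total} molecules, {active_fraction:.1%} active")  # 7684 molecules, 6.2% active


def ensemble_score(member_scores):
    """Average the predicted probabilities of the ensemble members (assumed rule)."""
    return mean(member_scores)


# Hypothetical outputs of ten classifiers for a single candidate molecule.
scores = [0.81, 0.77, 0.90, 0.68, 0.85, 0.79, 0.88, 0.74, 0.83, 0.80]
print(round(ensemble_score(scores), 3))  # 0.805
```

With only about 6% of the molecules active, a model that always answers "inactive" would already be right about 94% of the time, so plain accuracy says little about a classifier trained on this kind of data.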

To encode the data, the authors used SMILES strings, which are textual representations of chemical structures, and then tools from the RDKit library to interpret these SMILES strings and derive the relevant atoms and bonds. This encoding allowed the neural network to process and learn from the molecular structures effectively.

The message-passing neural network was trained on the growth inhibition dataset as an ensemble of ten models. The hyperparameters employed included the number of message-passing steps (3), neural network hidden size (300), number of feed-forward layers (2), and dropout probability (0). The model's performance was evaluated using tenfold cross-validation, a technique where the dataset is divided into ten subsets, and the model is repeatedly trained on nine of them and tested on the remaining one. The chemical relationship between molecules in the training and prediction datasets was measured using Tanimoto similarity, a score typically used to measure the similarity of two molecules, which is a whole topic in itself:

Chemical similarity – Wikipedia
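Both ideas from the paragraph above fit in a few lines of Python. For two fingerprints A and B, taken as sets of "on" bits, the Tanimoto similarity is the number of shared bits divided by the number of distinct bits, |A ∩ B| / |A ∪ B|. The sketch below is only illustrative: the fingerprints are invented, and the random shuffling into folds is my assumption, since I don't know exactly how the authors split their data.

```python
import random


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit indices."""
    union = len(bits_a | bits_b)
    # Two empty fingerprints are scored 0.0 here; this is just a convention.
    return len(bits_a & bits_b) / union if union else 0.0


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k roughly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Hypothetical fingerprints: each set holds the indices of the bits that are set.
fp_query = {1, 4, 7, 9}
fp_known = {1, 4, 8, 9, 12}
print(tanimoto(fp_query, fp_known))  # 3 shared bits / 6 distinct bits = 0.5

folds = k_fold_indices(7684, k=10)
for held_out in range(10):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # A model would be trained on train_idx and evaluated on test_idx here.
```

A score of 1 means identical fingerprints and 0 means no shared bits, so a predicted active with a low maximum similarity to the training molecules is structurally new in this sense.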

ML’s impact on modern research and ultimately on our life

This study underscores the value of ML in relevant, modern scientific research, here in particular regarding biology and antibiotic discovery which are tightly related to clinical applications. By leveraging its ability to rapidly analyze vast chemical datasets, ML enables researchers to identify molecules with targeted antibacterial properties more efficiently. This approach not only accelerates the drug discovery process but also increases the likelihood of finding compounds effective against highly resistant bacteria like A. baumannii.

The success of machine learning in this study opens up exciting possibilities for the future of antibiotic research. With the continued development of advanced algorithms and computational models, scientists can optimize the process of identifying structurally diverse and functionally unique antibacterial leads. By harnessing the power of artificial intelligence, we may be one step closer to overcoming the global challenge of antibiotic resistance.

Further reads and related material

The article:

Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii – Nature Chemical…

RDKit, a library for cheminformatics widely used in this kind of research where software needs to parse and manipulate molecules:

RDKit

The Drug Repurposing Hub, an open resource essential for research projects aimed at finding new uses for existing, already-approved molecules:

Drug Repurposing Hub

A few other cool applications of AI to Chemistry and neighboring fields of science:

After Beating Physics at Modeling Atoms and Molecules, Machine Learning Is Now Collaborating with…

The Era of Machine Learning for Protein Design, Summarized in Four Key Methods

ML goes after chemistry and material sciences -highlights of a review of interest to everybody…

New deep-learned tool designs novel proteins with high accuracy


www.lucianoabriata.com I write and photoshoot about everything that lies in my broad sphere of interests: nature, Science, technology, programming, etc.


