The role of artificial intelligence in drug discovery

Despite the buzz around artificial intelligence (AI), most industry insiders know that the use of machine learning (ML) in drug discovery is nothing new. For more than a decade, researchers have used computational techniques for many purposes, such as finding hits, modeling drug-protein interactions and predicting reaction rates.

Haydn Boehm

What is new is the hype. As AI has taken off in other industries, countless startups have emerged, promising to transform drug discovery and design with AI-based technologies for things like virtual screening, physics-based biological activity assessment and drug crystal-structure prediction.

Investors have made huge bets that these startups will succeed. Investment reached $13.8 billion in 2020. And more than one-third of large-pharma executives report using AI technologies.

While a few AI-native candidates are in clinical trials, around 90% remain in discovery or preclinical development, so it will take years to see if the bets pay off.

Artificial expectations

Along with big investments come high expectations: drug the undruggable, drastically shorten timelines, virtually eliminate wet lab work. Insider Intelligence projects that discovery costs could be reduced by as much as 70% with AI.

Unfortunately, it’s just not that easy. The complexity of human biology precludes AI from becoming a magic bullet. On top of this, data must be plentiful and clean enough to use. Models must be reliable. Prospective compounds need to be synthesizable. And drugs have to pass real-life safety and efficacy tests.

While this harsh reality hasn’t slowed investment, it has led to fewer companies receiving funding, to devaluations and to the discontinuation of some of the loftier programs, like IBM’s Watson AI for drug discovery.

This raises the question: Is AI for drug discovery more hype than hope? Absolutely not. Do we need to adjust our expectations and position for success? Absolutely, yes.

But how?

Three keys to implementing AI in drug discovery

Implementing AI in drug discovery requires reasonable expectations, clean data and collaboration. Let’s take a closer look.

Reasonable expectations — AI can be a valuable part of a company’s larger drug discovery program. But, for now, it’s best thought of as one option in a box of tools. Clarifying when, why, and how AI is used is crucial, albeit challenging.

Interestingly, investment has largely gone to companies developing small molecules, which lend themselves to AI because they’re relatively simple compared to biologics, and also because there are decades of data upon which to build models. The ease of applying AI also varies greatly across discovery, with models for early screening and physical-property prediction seemingly easier to implement than those for target prediction and toxicity assessment.

While the potential impact of AI is incredible, we should remember that good things take time. Pharmaceutical Technology recently asked its readers to project how long it might take for AI to reach its peak in drug discovery and, by far, the most common answer was “more than nine years.”

Clean data — “The main challenge to creating accurate and applicable AI models is that the available experimental data is heterogenous, noisy and sparse, so appropriate data curation and data collection is of the utmost importance.” This quote, from a 2021 Expert Opinion on Drug Discovery article, underscores the importance of collecting clean data. While it refers to ADMET and activity prediction models, the assertion also holds true in general. AI requires good data, and lots of it.

But good data are hard to come by. Publicly available data can be inadequate, forcing companies to rely on their own experimental data and domain knowledge. Unfortunately, many companies struggle to capture, federate, mine and prepare their data, perhaps due to skyrocketing data volumes, outdated software, incompatible lab systems or disconnected research teams. Success with AI will likely elude these companies until they implement technology and workflow processes that let them:

– Facilitate error-free data capture without relying on manual processing.

– Handle the volume and variety of data produced by different teams and partners.

– Ensure data integrity and standardize data for model readiness.
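To make the last point concrete, here is a minimal Python sketch of what standardizing data for model readiness might look like for a handful of assay records. Everything in it is an assumption made for illustration: the field names (compound_id, value, unit), the unit table and the choice to merge replicate measurements by their median are hypothetical, not a description of any particular vendor’s or lab’s pipeline.

```python
from statistics import median

# Hypothetical conversion factors to nanomolar; a real dataset would need more.
TO_NANOMOLAR = {"nM": 1.0, "uM": 1_000.0, "mM": 1_000_000.0}


def standardize(records):
    """Return ({compound_id: median value in nM}, [rejected rows])."""
    clean, rejected = {}, []
    for row in records:
        # Look up the unit; unknown units yield None and are rejected below.
        factor = TO_NANOMOLAR.get(str(row.get("unit", "")).strip())
        # Parse the measurement; missing or non-numeric values become None.
        try:
            value = float(row["value"])
        except (KeyError, TypeError, ValueError):
            value = None
        # Normalize identifiers so "cpd-001" and "CPD-001" are merged.
        compound = str(row.get("compound_id", "")).strip().upper()
        if factor is None or value is None or value <= 0 or not compound:
            rejected.append(row)  # keep bad rows for review, don't drop silently
            continue
        clean.setdefault(compound, []).append(value * factor)
    # Collapse replicate measurements into one value per compound.
    merged = {cid: median(vals) for cid, vals in clean.items()}
    return merged, rejected


records = [
    {"compound_id": "cpd-001", "value": "0.5", "unit": "uM"},
    {"compound_id": "CPD-001", "value": 700, "unit": "nM"},
    {"compound_id": "CPD-002", "value": "n/a", "unit": "nM"},
    {"compound_id": "CPD-003", "value": 2, "unit": "mg/mL"},
]
merged, rejected = standardize(records)
print(merged)         # one entry per compound, in nM
print(len(rejected))  # rows set aside for manual review
```

Rejected rows are returned instead of discarded, so that a scientist can review them. This reflects the data-integrity point above: a model should only see values that someone could trace back and defend.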

Collaboration — Companies hoping to leverage AI need a full view of all their data, not just bits and pieces. This demands a research infrastructure that lets computational and experimental teams collaborate, uniting workflows and sharing data across domains and locations. Careful process and methodology standardization is also needed to ensure that results obtained with the help of AI are repeatable.

Beyond collaboration within organizations, key industry players are also collaborating to help AI reach its full potential, making security and confidentiality key concerns. For example, many large pharmas have partnered with startups to help drive their AI efforts. Collaborative initiatives, such as the MELLODDY Project, have formed to help companies leverage pooled data to improve AI models.

Haydn Boehm is director of product marketing at Dotmatics, a leader in R&D scientific software connecting science, data and decision making.