Computational drug designers must pick a few chemical structures for experimental testing from around 10^60 options, more than the number of stars in the universe. To discover "hits" that have all the properties of a drug suitable for clinical testing, their search must be targeted and efficient.
This is a difficult problem, and pharma companies typically spend 10-15 years and $1B-$2B to solve it and bring a new drug to market. In a typical drug discovery workflow, researchers first identify the biological target and mechanism that they want to alter to treat the disease, a process called target identification. Then, once a target is identified, molecules that bind to that target must be identified (hit identification). These hits are then optimized for safety and therapeutic effect.
The biology and chemistry underlying each of these steps are complex, often involving the identification of cryptic patterns in enormous datasets and long cycles of biological experimentation, chemical synthesis, and validation. Even though pharma spent $262B on R&D in 2023 (Evaluate), 90% of drugs in clinical trials fail, demonstrating a need for innovative approaches to drug discovery (Nature Reviews Drug Discovery).
Here, we bring generative AI to bear on the problem, pre-optimizing molecules and screening their interaction with the target protein. This NIM Agent Blueprint shows how virtual screening can be recast using NVIDIA microservices for protein folding, molecule generation, and docking to speed the development cycle and produce better molecules, faster.
The user passes the sequence of the protein target that they want to design against to the AlphaFold2 NIM, which predicts that protein's structure. This step requires aligning the protein sequence to other known proteins, and multiple configurations for this alignment step are available.
An initial chemical structure is passed to the MolMIM NIM to seed its generative search through chemical space. The user can also choose a property to optimize for (e.g., QED), the number of molecules to generate, and other constraints. The generated molecules are scored and passed back to MolMIM for further optimization for multiple cycles depending on the number of iterations the user selects.
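The generate, score, and re-seed cycle described above can be sketched as a plain loop. This is a minimal illustration under stated assumptions, not MolMIM's API: `generate` and `score` are hypothetical callables standing in for a call to the generative model and for a property oracle such as QED, and the function name `optimize` is made up for this example.

```python
from typing import Callable, List, Tuple


def optimize(
    seed: str,
    generate: Callable[[str, int], List[str]],
    score: Callable[[str], float],
    iterations: int,
    num_molecules: int,
) -> Tuple[str, float]:
    """Iteratively generate candidates, score them, and re-seed with the best.

    `generate` and `score` are hypothetical placeholders for a generative
    model call and a property oracle; they are not part of any NVIDIA API.
    """
    best, best_score = seed, score(seed)
    for _ in range(iterations):
        # Ask the generator for candidates around the current best molecule.
        candidates = generate(best, num_molecules)
        for candidate in candidates:
            candidate_score = score(candidate)
            # Keep a candidate only if it strictly improves the objective.
            if candidate_score > best_score:
                best, best_score = candidate, candidate_score
    return best, best_score
```

Passing the generator and the oracle in as arguments keeps the loop independent of any particular service, so the same skeleton could wrap a remote endpoint or a local scoring function.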
These molecular structures and the structure of the protein target are passed to the DiffDock NIM, which generates the number of binding poses that the user requests, subject to any other constraints the user sets.
The user then clicks the "Generate Molecules" button, and when complete, optimized molecules are returned to the user, ready for further lab testing.
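The data flow through the three steps above (fold, generate, dock) can be sketched as a small orchestration function. This is only an outline under stated assumptions: `fold`, `generate`, and `dock` are hypothetical callables standing in for the AlphaFold2, MolMIM, and DiffDock services, and `ScreeningResult` and `run_virtual_screening` are names invented for this example rather than part of the blueprint's code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ScreeningResult:
    structure: str               # predicted protein structure (e.g. PDB text)
    molecules: List[str]         # generated molecules (e.g. SMILES strings)
    poses: Dict[str, List[str]]  # molecule -> docked poses for that molecule


def run_virtual_screening(
    sequence: str,
    seed_molecule: str,
    num_molecules: int,
    num_poses: int,
    fold: Callable[[str], str],
    generate: Callable[[str, int], List[str]],
    dock: Callable[[str, str, int], List[str]],
) -> ScreeningResult:
    """Chain the three stages; all three callables are hypothetical stand-ins."""
    # Step 1: protein sequence -> predicted structure.
    structure = fold(sequence)
    # Step 2: seed molecule -> generated candidate molecules.
    molecules = generate(seed_molecule, num_molecules)
    # Step 3: dock every candidate against the structure, num_poses each.
    poses = {m: dock(structure, m, num_poses) for m in molecules}
    return ScreeningResult(structure, molecules, poses)
```

In a real deployment each callable would wrap a request to the corresponding microservice; the request formats are not shown here because they are not given in this overview.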
NVIDIA NIM™ Agent Blueprints are customizable AI workflow examples that equip enterprise developers with NIM microservices, reference code, documentation, and a Helm chart for deployment.
See a complete example of how to get started with this blueprint on the NVIDIA NIM Agent Blueprints GitHub repository.
MolMIM can optimize molecules for user-defined objectives. In this example, MolMIM employs oracles for QED (drug likeness), penalized log P (a lipophilicity-based measure related to solubility), and similarity as measured by the Tanimoto index. To use MolMIM for optimization tasks on user-defined objectives, use the downloadable NIM and its decoder endpoint. Learn more about how to generate molecules with user-defined oracle functions in MolMIM's documentation and example Jupyter notebooks.
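For readers unfamiliar with the Tanimoto index, it is commonly computed on molecular fingerprints as the number of shared "on" bits divided by the number of bits set in either fingerprint. The sketch below assumes fingerprints are represented as Python sets of bit indices; in practice they would come from a cheminformatics toolkit, and how MolMIM computes similarity internally is not described here.

```python
from typing import AbstractSet


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(a & b) / union
```

For example, the bit sets {1, 2, 3} and {2, 3, 4} share two bits out of four distinct bits, giving a similarity of 0.5.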
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.
Use of the models in this Generative Virtual Screening Blueprint is governed by the NVIDIA AI Foundation Models Community License.
This blueprint shows how generative AI and accelerated NIM microservices can design optimized small molecules smarter and faster.