
OpenFold3 is a third-generation biomolecular foundation model that predicts the three-dimensional structures of molecular complexes (proteins, DNA, RNA, ligands).
OpenFold3 is a biomolecular complex structure prediction model from the OpenFold Consortium and the AlQuraishi Laboratory. OpenFold3 is a PyTorch re-implementation of Google DeepMind's AlphaFold3, with support for both training and inference. See the GitHub repository at https://github.com/aqlaboratory/openfold-3.
This model is available for commercial use.
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case.
GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Open Model License. ADDITIONAL INFORMATION: Apache 2.0 License.
You are responsible for ensuring that your use of NVIDIA provided models complies with all applicable laws.
Global
The OpenFold3 NIM can be used at academic and pharmaceutical industry research labs. The structure prediction functionality supports computer-aided drug design.
@article{Abramson2024,
author = {Abramson, Josh and Adler, Jonas and Dunger, Jack and Evans, Richard and Green, Tim and Pritzel, Alexander and Ronneberger, Olaf and Willmore, Lindsay and Ballard, Andrew J. and Bambrick, Joshua and Bodenstein, Sebastian W. and Evans, David A. and Hung, Chia-Chun and O’Neill, Michael and Reiman, David and Tunyasuvunakool, Kathryn and Wu, Zachary and Žemgulytė, Akvilė and Arvaniti, Eirini and Beattie, Charles and Bertolli, Ottavia and Bridgland, Alex and Cherepanov, Alexey and Congreve, Miles and Cowen-Rivers, Alexander I. and Cowie, Andrew and Figurnov, Michael and Fuchs, Fabian B. and Gladman, Hannah and Jain, Rishub and Khan, Yousuf A. and Low, Caroline M. R. and Perlin, Kuba and Potapenko, Anna and Savy, Pascal and Singh, Sukhdeep and Stecula, Adrian and Thillaisundaram, Ashok and Tong, Catherine and Yakneen, Sergei and Zhong, Ellen D. and Zielinski, Michal and Žídek, Augustin and Bapst, Victor and Kohli, Pushmeet and Jaderberg, Max and Hassabis, Demis and Jumper, John M.},
journal = {Nature},
title = {Accurate structure prediction of biomolecular interactions with AlphaFold 3},
year = {2024},
volume = {630},
number = {8016},
pages = {493--500},
doi = {10.1038/s41586-024-07487-w}
}
Architecture Type: Protein Structure Prediction
Network Architecture: AlphaFold3
** This model was developed based on AlphaFold3
** Number of model parameters: 3.68×10⁸
Input Type(s): Protein Sequence, Multiple Sequence Alignments; DNA Sequence; RNA Sequence; Ligand CCD code; Ligand SMILES code
Input Format(s): String (less than or equal to 1000), a3m-format strings, csv-format string, string
Input Parameters: One-Dimensional (1D), One-Dimensional (1D), One-Dimensional (1D); One-Dimensional (1D); One-Dimensional (1D); One-Dimensional (1D); One-Dimensional (1D)
Other Properties Related to Input: a3m is a standard file format for storing multiple sequence alignment results. a3m-format strings, csv-format string is a standard format for atomic structures
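To illustrate the a3m input described above, here is a minimal sketch of reading an a3m-format string in Python. It relies only on the general a3m convention that lowercase letters mark insertions relative to the query sequence. The function name `parse_a3m` and the example alignment are hypothetical and are not part of the OpenFold3 NIM API.

```python
def parse_a3m(text):
    """Return (names, aligned_rows) from an a3m-format string (illustrative helper)."""
    names, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: start a new record.
            names.append(line[1:])
            rows.append("")
        elif rows:
            # Sequence data may span several lines; append to the current record.
            rows[-1] += line
    # Lowercase letters are insertions relative to the query; dropping them
    # leaves every row with the same number of columns as the query.
    aligned = ["".join(c for c in row if not c.islower()) for row in rows]
    return names, aligned


# Hypothetical three-sequence alignment; the first record is the query.
example = ">query\nMKTAY\n>hit1\nMKtgTAY\n>hit2\nM-TAY\n"
names, aligned = parse_a3m(example)
```

After insertions are removed, every row in `aligned` has the same length as the query, which makes a quick sanity check before submitting an alignment.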
Output Type(s): Biomolecular Complex Structure(s) in mmCIF format
Output Format: mmCIF/PDB (text)
Output Parameters: 1D
Other Properties Related to Output: Pose (num_atoms x 3)
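The pose shape above (one x, y, z triple per atom) can be recovered from mmCIF text by reading the `_atom_site` loop. The following is a minimal sketch, assuming whitespace-separated values with no embedded spaces and a single `_atom_site` loop. The helper name `atom_site_coords` and the two-atom fragment are hypothetical and do not come from actual OpenFold3 output. For real files, a dedicated mmCIF parser is the safer choice.

```python
def atom_site_coords(mmcif_text):
    """Collect (x, y, z) tuples from the _atom_site loop of an mmCIF string."""
    columns, coords = [], []
    for line in mmcif_text.splitlines():
        s = line.strip()
        if s.startswith("_atom_site."):
            # Column declaration, e.g. "_atom_site.Cartn_x" -> "Cartn_x".
            columns.append(s.split(".", 1)[1])
        elif columns:
            # The loop ends at a blank line, a comment, or the next category.
            if not s or s.startswith(("#", "loop_", "_")):
                break
            fields = s.split()
            coords.append(
                tuple(float(fields[columns.index(k)])
                      for k in ("Cartn_x", "Cartn_y", "Cartn_z"))
            )
    return coords


# Hypothetical two-atom fragment used only to exercise the helper.
fragment = "\n".join([
    "loop_",
    "_atom_site.group_PDB",
    "_atom_site.id",
    "_atom_site.label_atom_id",
    "_atom_site.Cartn_x",
    "_atom_site.Cartn_y",
    "_atom_site.Cartn_z",
    "ATOM 1 N 1.0 2.0 3.0",
    "ATOM 2 CA 2.5 2.0 3.0",
    "#",
])
coords = atom_site_coords(fragment)
```

The result is a list with one entry per atom and three floats per entry, matching the num_atoms x 3 layout.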
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Runtime Engine(s):
Supported Hardware Microarchitecture Compatibility:
[Preferred/Supported] Operating System(s):
The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
Link: Accurate structure prediction of biomolecular interactions with AlphaFold 3
The model weights were trained by the OpenFold Consortium, following the procedure in Accurate structure prediction of biomolecular interactions with AlphaFold 3. The training data was not collected by NVIDIA.
** Data Collection Method by dataset
** Labeling Method by dataset
Properties (Quantity, Dataset Descriptions, Sensor(s)):
During training, each learning example is composed of the input and the target, where the target is a crop (a piece) of the 3D structure of a biomolecular complex. The biomolecular complex structure is either (a) experimentally determined, or (b) model-predicted. The input is composed of the crop-restricted portions of (i) ligand identifiers and protein, DNA, and RNA sequences, (ii) protein and RNA multiple sequence alignments, and (iii) protein structural templates.
The experimental complex structures are sourced from the PDB, and are organized into a
The predicted complex structures are organized into
In total, these datasets are composed of ~13 million complexes. Throughout the training process, a total of ~20 million sample complex crops are drawn from these datasets, using the probability weights described in the Supplementary Information of Accurate structure prediction of biomolecular interactions with AlphaFold 3.
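The sampling scheme described above (pick a dataset by probability weight, then draw a crop from one of its complexes) can be sketched as follows. This is an illustration under stated assumptions, not the OpenFold3 training code: the dataset names, the weights, and the crop size are hypothetical placeholders, and a contiguous window is used as the simplest possible crop. The actual weights and cropping strategies are those given in the cited Supplementary Information.

```python
import random


def sample_crop(datasets, weights, crop_size, rng):
    """Pick a dataset by weight, pick a complex, and return a contiguous crop."""
    # Weighted choice among dataset names (weights are hypothetical here).
    name = rng.choices(list(datasets), weights=weights, k=1)[0]
    complex_tokens = rng.choice(datasets[name])
    if len(complex_tokens) <= crop_size:
        # Small complexes are used whole.
        return name, complex_tokens
    # Contiguous window: an assumption made for brevity.
    start = rng.randrange(len(complex_tokens) - crop_size + 1)
    return name, complex_tokens[start:start + crop_size]


# Placeholder datasets: each complex is represented as a plain token string.
datasets = {
    "dataset_a": ["MKTAYIAKQR", "GSHMLE"],
    "dataset_b": ["ACGUACGUACGU"],
}
weights = [0.7, 0.3]  # hypothetical sampling probabilities
rng = random.Random(0)
name, crop = sample_crop(datasets, weights, crop_size=8, rng=rng)
```

Passing an explicit `random.Random` instance keeps the draw reproducible, which helps when comparing sampling runs.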
All of the experimental complex structures in the Weighted PDB Dataset were deposited before 2021-09-30.
The computation of protein and RNA multiple sequence alignments and protein structure templates follows the methods in Accurate structure prediction of biomolecular interactions with AlphaFold 3, Supplementary Information.
Link: See the description at Accurate structure prediction of biomolecular interactions with AlphaFold 3.
** Data Collection Method by dataset
** Labeling Method by dataset
Properties (Quantity, Dataset Descriptions, Sensor(s)):
For evaluation (post-validation benchmarks), each learning example has an input and a target, similar to the Training Dataset, but is not restricted to a crop. Every complex in the Evaluation Dataset has a structure determined after 2021-09-30.
The computation of protein and RNA multiple sequence alignments and protein structure templates follows the methods in Accurate structure prediction of biomolecular interactions with AlphaFold 3, Supplementary Information.
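The date-based separation between training and evaluation structures (before versus after 2021-09-30) can be expressed as a simple filter. The sketch below is illustrative only: the entry identifiers and dates are hypothetical, and because the text does not say how a structure dated exactly 2021-09-30 is treated, such entries are left out of both sets here.

```python
from datetime import date

CUTOFF = date(2021, 9, 30)  # cutoff date stated in the dataset descriptions


def split_by_date(entries):
    """Split (identifier, date) pairs into training and evaluation lists."""
    train = [name for name, d in entries if d < CUTOFF]
    evaluation = [name for name, d in entries if d > CUTOFF]
    # Entries dated exactly on the cutoff are assigned to neither set (assumption).
    return train, evaluation


# Hypothetical entries used only to exercise the split.
entries = [
    ("entry_a", date(2019, 5, 1)),
    ("entry_b", date(2021, 9, 30)),
    ("entry_c", date(2022, 1, 15)),
]
train, evaluation = split_by_date(entries)
```

Keeping the two sets disjoint by date helps ensure that evaluation complexes were not available when the training structures were collected.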
Engine: TRT, PyTorch
Test Hardware:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.
Get access to knowledge base articles and support cases or submit a ticket.