Copyright © 2025 NVIDIA Corporation

openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

Biology, BioNeMo, Protein Folding, NIM, Drug Discovery
Accelerated by DGX Cloud

Model Overview

Description:

OpenFold2 is a protein structure prediction model from the OpenFold Consortium and the AlQuraishi Laboratory. OpenFold2 is a PyTorch re-implementation of Google DeepMind's AlphaFold2, with support for both training and inference. OpenFold2 demonstrates accuracy parity with AlphaFold2 at improved speed. For more information, see the OpenFold repository at https://github.com/aqlaboratory/openfold.
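
The NIM exposes structure prediction as a web service. A minimal sketch of assembling a request body is below; the field names (`sequence`, `alignments`, `templates`) are illustrative assumptions, not the documented schema, so consult the API Reference for the actual contract.

```python
import json

# Sketch of assembling a request body for the OpenFold2 NIM. The field names
# ("sequence", "alignments", "templates") are illustrative assumptions;
# see the API Reference for the documented schema.
def build_fold_request(sequence, alignments=None, templates=None):
    """Build a JSON request body for a structure-prediction call."""
    if len(sequence) > 1000:
        raise ValueError("sequence exceeds the 1000-residue input limit")
    payload = {"sequence": sequence}
    if alignments is not None:
        payload["alignments"] = alignments  # list of a3m-format strings
    if templates is not None:
        payload["templates"] = templates    # list of hhr-format strings
    return json.dumps(payload)
```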

This model is available for commercial use.

Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case.

License/Terms of Use:

GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service, and use of this model is governed by the NVIDIA Community Model License. ADDITIONAL INFORMATION: Apache 2.0 License.

You are responsible for ensuring that your use of NVIDIA provided models complies with all applicable laws.

Deployment Geography:

Global

Use Case

The OpenFold2 NIM can be used in academic and pharmaceutical industry research labs. Its structure prediction functionality supports computer-aided drug design.

Release Date

  • build.nvidia.com: November 28, 2025 at build.nvidia.com/openfold/openfold2
  • NGC: November 28, 2025 at https://registry.ngc.nvidia.com

References:

@article {Ahdritz2022.11.20.517210,
	author = {Ahdritz, Gustaf and Bouatta, Nazim and Floristean, Christina and Kadyan, Sachin and Xia, Qinghui and Gerecke, William and O{\textquoteright}Donnell, Timothy J and Berenberg, Daniel and Fisk, Ian and Zanichelli, Niccolò and Zhang, Bo and Nowaczynski, Arkadiusz and Wang, Bei and Stepniewska-Dziubinska, Marta M and Zhang, Shang and Ojewole, Adegoke and Guney, Murat Efe and Biderman, Stella and Watkins, Andrew M and Ra, Stephen and Lorenzo, Pablo Ribalta and Nivon, Lucas and Weitzner, Brian and Ban, Yih-En Andrew and Sorger, Peter K and Mostaque, Emad and Zhang, Zhao and Bonneau, Richard and AlQuraishi, Mohammed},
	title = {{O}pen{F}old: {R}etraining {A}lpha{F}old2 yields new insights into its learning mechanisms and capacity for generalization},
	elocation-id = {2022.11.20.517210},
	year = {2022},
	doi = {10.1101/2022.11.20.517210},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/10.1101/2022.11.20.517210},
	eprint = {https://www.biorxiv.org/content/early/2022/11/22/2022.11.20.517210.full.pdf},
	journal = {bioRxiv}
}
@ARTICLE{jumper2021alphafold,
    title    = "Highly accurate protein structure prediction with {AlphaFold}",
    author   = "Jumper, John and Evans, Richard and Pritzel, Alexander and Green,
                Tim and Figurnov, Michael and Ronneberger, Olaf and
                Tunyasuvunakool, Kathryn and Bates, Russ and {\v Z}{\'\i}dek,
                Augustin and Potapenko, Anna and Bridgland, Alex and Meyer,
                Clemens and Kohl, Simon A A and Ballard, Andrew J and Cowie,
                Andrew and Romera-Paredes, Bernardino and Nikolov, Stanislav and
                Jain, Rishub and Adler, Jonas and Back, Trevor and Petersen, Stig
                and Reiman, David and Clancy, Ellen and Zielinski, Michal and
                Steinegger, Martin and Pacholska, Michalina and Berghammer, Tamas
                and Bodenstein, Sebastian and Silver, David and Vinyals, Oriol
                and Senior, Andrew W and Kavukcuoglu, Koray and Kohli, Pushmeet
                and Hassabis, Demis",
    journal  = "Nature",
    volume   =  596,
    number   =  7873,
    pages    = "583--589",
    month    =  aug,
    year     =  2021,
    language = "en",
    doi = {10.1038/s41586-021-03819-2},
}

Model Architecture:

Architecture Type: Protein Structure Prediction
Network Architecture: AlphaFold2

Input:

Input Type(s): Protein Sequence, Multiple Sequence Alignments, Templates
Input Format(s): String (1000 residues or fewer), a3m-format strings, hhr-format strings
Input Parameters: One-Dimensional (1D), One-Dimensional (1D), One-Dimensional (1D)
Other Properties Related to Input: a3m is a standard file format for storing multiple sequence alignment results; hhr is the file format output by the tool HH-search. For more information, see HH-suite.
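
The a3m alignments mentioned above are FASTA-like, with lowercase letters marking insertions relative to the query. A minimal reader that strips those insertion columns might look like this (a sketch for downstream tooling, not part of the NIM):

```python
def parse_a3m(a3m_text):
    """Parse an a3m-format MSA string into (header, aligned_sequence) pairs.

    In a3m, lowercase letters mark insertions relative to the query; they are
    stripped here so every returned sequence has the same aligned length.
    """
    entries = []
    header, chunks = None, []
    for line in a3m_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        if line.startswith(">"):
            if header is not None:
                entries.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            # drop lowercase insertion columns, keep gaps ("-") and uppercase
            chunks.append("".join(c for c in line if not c.islower()))
    if header is not None:
        entries.append((header, "".join(chunks)))
    return entries
```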

Output:

Output Type(s): Protein Structure(s) in PDB Format
Output Format: PDB (text file)
Output Parameters: 1D
Other Properties Related to Output: Pose coordinates (num_atoms × 3)
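
Since the output is plain PDB text, downstream tooling can read it with ordinary fixed-width parsing. A minimal sketch that pulls the C-alpha trace out of ATOM records (column positions follow the standard PDB format):

```python
def pdb_ca_coords(pdb_text):
    """Extract C-alpha coordinates from PDB ATOM records as (x, y, z) tuples.

    PDB is a fixed-width format: the atom name sits in columns 13-16 and the
    x/y/z coordinates in columns 31-38, 39-46, and 47-54 respectively.
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords
```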

Software Integration:

Runtime Engine(s):

  • PyTorch
  • TensorRT-BioNeMo

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Hopper
  • NVIDIA Ampere
  • NVIDIA Lovelace
  • NVIDIA Blackwell

Supported Operating System(s):

  • Linux

Model Version(s):

  • AlphaFold weights 2.3.2
  • OpenFold 2.1.0 (pl_upgrade)

Training & Evaluation:

Training Dataset:

Link: Highly Accurate ... Data Availability

The model parameter sets were trained by Google Deepmind as part of AlphaFold2 development. A description of the training dataset and relevant download links are available at Highly Accurate ... Data Availability. This data was not collected by NVIDIA.

Data Collection Method by dataset:

  • Hybrid: Automatic/Sensors, Human
  • See the description at Highly Accurate ... Data Availability.

Labeling Method by dataset:

  • Hybrid: Automatic/Sensors, Human
  • See the description at Highly Accurate ... Data Availability.

Properties (Quantity, Dataset Descriptions, Sensor(s)): Self-distillation predictions were generated on a Uniclust dataset of 355,993 sequences with full MSAs. These predictions were then used to train a final model with identical hyperparameters, except that examples were sampled 75% of the time from the Uniclust prediction set (with sub-sampled MSAs) and 25% of the time from the clustered PDB set.
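
The 75/25 sampling mix described above can be sketched as follows; the function and dataset names are illustrative, not taken from the actual AlphaFold2/OpenFold training code:

```python
import random

# Sketch of the 75/25 training-data mix. Names are illustrative;
# this is not the actual AlphaFold2/OpenFold training loop.
def sample_training_example(uniclust_set, pdb_set, rng):
    """Draw one training example: 75% of the time from the Uniclust
    self-distillation prediction set, 25% from the clustered PDB set."""
    if rng.random() < 0.75:
        return ("uniclust", rng.choice(uniclust_set))
    return ("pdb", rng.choice(pdb_set))
```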

Evaluation Dataset:

Link: See the description at Highly Accurate ... Section 10.

Data Collection Method by dataset:

  • Hybrid: Automatic/Sensors, Human
  • See the description at Highly Accurate ... Data Availability.

Labeling Method by dataset:

  • Hybrid: Automatic/Sensors, Human
  • See the description at Highly Accurate ... Data Availability.

Properties (Quantity, Dataset Descriptions, Sensor(s)): Self-distillation predictions were generated on a Uniclust dataset of 355,993 sequences with full MSAs. These predictions were then used to train a final model with identical hyperparameters, except that examples were sampled 75% of the time from the Uniclust prediction set (with sub-sampled MSAs) and 25% of the time from the clustered PDB set.

Inference:

Engine: PyTorch
Test Hardware:

  • NVIDIA H100
  • NVIDIA A100
  • NVIDIA L40S
  • NVIDIA RTX 6000 Ada
  • NVIDIA B200

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI concerns.
