## Model Overview

### Description:

The MSA Search NIM is powered by GPU MMseqs2, a GPU-accelerated toolkit for protein database search, Multiple Sequence Alignment (MSA), and Structural Template Search. While not a deep learning model, MMseqs2 does require large protein databases for sequence similarity search and structural template discovery.

The container components are ready for commercial use.
### Third-Party Community Consideration

This model is not owned or developed by NVIDIA. This model has been developed and built to a third party’s requirements for this application and use case. ColabFold was developed by the authors of Mirdita *et al*. 2022. GPU MMseqs2 was developed by the authors of Kallenborn *et al*. 2025.

#### License / Terms of Use

GOVERNING TERMS:

**API Catalog:** The trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf).

**NIM Container:** The NIM container is governed by the [NVIDIA Software License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and [Product-Specific Terms](https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/) for AI Products.

### Deployment Geography

Global

### Use Case

The MSA Search NIM enables researchers and commercial entities in the Drug Discovery, Life Sciences, and Digital Biology fields to rapidly generate multiple sequence alignments (MSA) and find structural templates from PDB databases. The output MSA and template structures can be used in downstream protein structure prediction and evolutionary analysis applications.
### Release Date

#### 1.0.0

Build.nvidia.com March 16, 2025 via [build.nvidia.com/colabfold/msa-search](build.nvidia.com/colabfold/msa-search)

NGC March 16, 2025

#### 2.0.0

NGC November 25, 2025 via [catalog.ngc.nvidia.com/orgs/nim/teams/colabfold/containers/msa-search](catalog.ngc.nvidia.com/orgs/nim/teams/colabfold/containers/msa-search)

#### 2.1.0

NGC December 18, 2025 via [catalog.ngc.nvidia.com/orgs/nim/teams/colabfold/containers/msa-search](catalog.ngc.nvidia.com/orgs/nim/teams/colabfold/containers/msa-search)

#### 2.2.0

NGC January 29, 2026 via [catalog.ngc.nvidia.com/orgs/nim/teams/colabfold/containers/msa-search](catalog.ngc.nvidia.com/orgs/nim/teams/colabfold/containers/msa-search)

### References:

```
@ARTICLE{jumper2021alphafold,
  title = "Highly accurate protein structure prediction with {AlphaFold}",
  author = "Jumper, John and Evans, Richard and Pritzel, Alexander and Green, Tim and Figurnov, Michael and Ronneberger, Olaf and Tunyasuvunakool, Kathryn and Bates, Russ and {\v Z}{\'\i}dek, Augustin and Potapenko, Anna and Bridgland, Alex and Meyer, Clemens and Kohl, Simon A A and Ballard, Andrew J and Cowie, Andrew and Romera-Paredes, Bernardino and Nikolov, Stanislav and Jain, Rishub and Adler, Jonas and Back, Trevor and Petersen, Stig and Reiman, David and Clancy, Ellen and Zielinski, Michal and Steinegger, Martin and Pacholska, Michalina and Berghammer, Tamas and Bodenstein, Sebastian and Silver, David and Vinyals, Oriol and Senior, Andrew W and Kavukcuoglu, Koray and Kohli, Pushmeet and Hassabis, Demis",
  journal = "Nature",
  volume = 596,
  number = 7873,
  pages = "583--589",
  month = aug,
  year = 2021,
  language = "en",
  doi = {10.1038/s41586-021-03819-2},
}
```

```
@ARTICLE{mirdita2022colabfold,
  title = "ColabFold: making protein folding accessible to all",
  author = "Mirdita, Milot and Sch{\"u}tze, Konstantin and Moriwaki, Yoshitaka and Heo, Lim and Ovchinnikov, Sergey and Steinegger, Martin",
  journal = "Nature Methods",
  volume = 19,
  number = 6,
  pages = "679--682",
  month = jun,
  year = 2022,
  language = "en",
  doi = {10.1038/s41592-022-01488-1},
}
```

```
@ARTICLE{kallenborn2025gpu,
  title = "GPU-accelerated homology search with MMseqs2",
  author = "Kallenborn, Felix and Chacon, Alejandro and Hundt, Christian and Sirelkhatim, Hassan and Didi, Kieran and Cha, Sooyoung and Dallago, Christian and Mirdita, Milot and Schmidt, Bertil and Steinegger, Martin",
  journal = "bioRxiv",
  year = 2025,
  month = jan,
  day = 20,
  language = "en",
  doi = {10.1101/2024.11.13.623350},
}
```
### Model Architecture:

**Architecture Type:** Not Applicable
**Network Architecture:** Not Applicable
### Input:

**Input Type(s):** Protein Sequence, Databases, Structural Template Databases
**Input Format(s):** String (less than or equal to 4096 characters), Constrained List of Strings (one or more valid database names)
**Input Parameters:** String: 1D; Constrained List of Strings: 1D
**Other Properties Related to Input:** NA
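
The input constraints above (a sequence string of at most 4096 characters and a list of one or more valid database names) can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: the helper `validate_msa_request` is hypothetical, and the database names in `EXAMPLE_DATABASES` are illustrative placeholders derived from the model identifiers listed in this card, not a confirmed list of names the service accepts.

```python
MAX_SEQUENCE_LENGTH = 4096  # character limit stated in this card

# Hypothetical database names, for illustration only; the names the
# service actually accepts should be taken from its API documentation.
EXAMPLE_DATABASES = frozenset(
    {"uniref30_2302", "pdb70_220313", "colabfold_envdb_202108"}
)


def validate_msa_request(sequence, databases, allowed=EXAMPLE_DATABASES):
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []

    # The sequence must be a non-empty string within the length limit.
    if not isinstance(sequence, str) or not sequence:
        problems.append("sequence must be a non-empty string")
    elif len(sequence) > MAX_SEQUENCE_LENGTH:
        problems.append(
            f"sequence has {len(sequence)} characters; "
            f"limit is {MAX_SEQUENCE_LENGTH}"
        )

    # At least one database is required, and each must be a known name.
    if not databases:
        problems.append("at least one database name is required")
    else:
        unknown = sorted(set(databases) - set(allowed))
        if unknown:
            problems.append("unknown database name(s): " + ", ".join(unknown))

    return problems
```

Returning a list of problems, instead of raising on the first one, lets a caller report every issue with a request at once.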
### Output:

**Output Type(s):** Multiple Sequence Alignment in A3M or FASTA format; Structural templates in mmCIF format
**Output Format:** A3M or FASTA (text file); mmCIF (text file)
**Output Parameters:** 1D
**Other Properties Related to Output:** N/A
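
For downstream use, the A3M text output can be read with a few lines of Python. The sketch below is illustrative, not part of the NIM: the function names `read_a3m` and `match_columns` are hypothetical, and it relies on the general A3M convention that lowercase letters mark insertions relative to the query, so removing them leaves every row aligned to the query's columns. Lines starting with `#` are skipped on the assumption that they are comment or metadata lines.

```python
def read_a3m(text):
    """Parse A3M or FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment/metadata lines (assumption).
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def match_columns(sequence):
    """Drop lowercase insertion states, keeping match columns and gaps."""
    return "".join(ch for ch in sequence if not ch.islower())
```

For example, `match_columns("MKgTAY")` returns `"MKTAY"`, the same length as a five-residue query row.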
### Software Integration:

**Runtime Engine(s):**
* Python, C++, CUDA
**Supported Hardware Microarchitecture Compatibility:**
* NVIDIA Ampere, NVIDIA Hopper, NVIDIA Ada Lovelace
**Supported Operating System(s):**
* Linux
### Model Version(s):

The MSA Search NIM container downloads the following NGC models:

* `nim/colabfold/msa-search:uniref30_2302-m18v1`
* `nim/colabfold/msa-search:pdb_20251028_zip-m18v1`
* `nim/colabfold/msa-search:pdb70_220313-m18v1`
* `nim/colabfold/msa-search:pdb100_230517-m18v1`
* `nim/colabfold/msa-search:colabfold_envdb_202108-m18v1`
## Training & Evaluation:

Not Applicable.

### Training Dataset:

**Link:** Not Applicable.

**Data Collection Method by dataset:**
* Not Applicable

**Labeling Method by dataset:**
* Not Applicable

**Properties:** Not Applicable.

### Evaluation Dataset:

**Link:** Not Applicable.

**Data Collection Method by dataset:**
* Not Applicable

**Labeling Method by dataset:**
* Not Applicable

**Properties:** Not Applicable.
### Inference:

**Engine:** Python, C++, CUDA
**Test Hardware:**
* NVIDIA B200
* NVIDIA A6000 Ada
* NVIDIA A100
* NVIDIA L40
* NVIDIA H100
### Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal developer team to ensure these software components meet requirements for the relevant industry and use case and address unforeseen product misuse. Please report quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

**You are responsible for ensuring that your use of NVIDIA provided models complies with all applicable laws.**