Generates a multiple sequence alignment for a query sequence by searching protein sequence databases.
The MSA Search NIM is powered by GPU MMSeqs2, a GPU-accelerated toolkit for protein database search and multiple sequence alignment (MSA). Although MMSeqs2 is not a deep learning model, it requires large protein databases for sequence similarity search.
This NIM is ready for commercial use.
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case. ColabFold was developed by the authors of Mirdita et al. 2022. GPU MMSeqs2 was developed by the authors of Kallenborn et al. 2025.
GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service.
You are responsible for ensuring that your use of NVIDIA AI Foundation Models complies with all applicable laws.
Deployment Geography: Global
The MSA Search NIM enables researchers and commercial entities in the Drug Discovery, Life Sciences, and Digital Biology fields to rapidly generate multiple sequence alignments (MSA). The output MSA can be used in downstream protein structure prediction and evolutionary analysis applications.
Build.nvidia.com: March 16, 2025, via build.nvidia.com/colabfold/msa-search
NGC: March 16, 2025
@ARTICLE{jumper2021alphafold,
  title = "Highly accurate protein structure prediction with {AlphaFold}",
  author = "Jumper, John and Evans, Richard and Pritzel, Alexander and Green, Tim and Figurnov, Michael and Ronneberger, Olaf and Tunyasuvunakool, Kathryn and Bates, Russ and {\v Z}{\'\i}dek, Augustin and Potapenko, Anna and Bridgland, Alex and Meyer, Clemens and Kohl, Simon A A and Ballard, Andrew J and Cowie, Andrew and Romera-Paredes, Bernardino and Nikolov, Stanislav and Jain, Rishub and Adler, Jonas and Back, Trevor and Petersen, Stig and Reiman, David and Clancy, Ellen and Zielinski, Michal and Steinegger, Martin and Pacholska, Michalina and Berghammer, Tamas and Bodenstein, Sebastian and Silver, David and Vinyals, Oriol and Senior, Andrew W and Kavukcuoglu, Koray and Kohli, Pushmeet and Hassabis, Demis",
  journal = "Nature",
  volume = 596,
  number = 7873,
  pages = "583--589",
  month = aug,
  year = 2021,
  language = "en",
  doi = {10.1038/s41586-021-03819-2},
}
@ARTICLE{mirdita2022colabfold,
  title = "ColabFold: making protein folding accessible to all",
  author = "Mirdita, Milot and Sch{\"u}tze, Konstantin and Moriwaki, Yoshitaka and Heo, Lim and Ovchinnikov, Sergey and Steinegger, Martin",
  journal = "Nature Methods",
  volume = 19,
  number = 6,
  pages = "679--682",
  month = jun,
  year = 2022,
  language = "en",
  doi = {10.1038/s41592-022-01488-1},
}
@ARTICLE{kallenborn2025gpu,
  title = "GPU-accelerated homology search with MMseqs2",
  author = "Kallenborn, Felix and Chacon, Alejandro and Hundt, Christian and Sirelkhatim, Hassan and Didi, Kieran and Cha, Sooyoung and Dallago, Christian and Mirdita, Milot and Schmidt, Bertil and Steinegger, Martin",
  journal = "bioRxiv",
  year = 2025,
  month = jan,
  day = 20,
  language = "en",
  doi = {10.1101/2024.11.13.623350},
}
Architecture Type: Not Applicable
Network Architecture: Not Applicable
Input Type(s): Protein Sequence, Databases
Input Format(s): String (less than or equal to 4096 characters), Constrained List of Strings (one or more valid database names)
Input Parameters: String: 1D; Constrained List of Strings: 1D
Other Properties Related to Input: NA
Output Type(s): Multiple Sequence Alignment in A3M or FASTA format
Output Format: A3M or FASTA (text file)
Output Parameters: 1D
Other Properties Related to Output: N/A
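As an illustration of working with the A3M output described above, the following is a minimal sketch of reading A3M text and reducing each row to the query's alignment columns. It relies on the A3M convention that lowercase letters mark insertions relative to the query. The function names `parse_a3m` and `strip_insertions` are hypothetical helpers written for this example and are not part of the NIM.

```python
def parse_a3m(text):
    """Split A3M or FASTA text into (header, sequence) pairs.

    Lines that appear before the first '>' header (for example a
    leading comment line) are ignored.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequences may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def strip_insertions(sequence):
    """Drop lowercase insertion states, keeping match columns and '-' gaps."""
    return "".join(ch for ch in sequence if not ch.islower())
```

After insertions are stripped, every row should have the same length as the query, which is a convenient sanity check before passing the alignment to a downstream tool.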
Runtime Engine(s):
Supported Hardware Microarchitecture Compatibility:
[Preferred/Supported] Operating System(s):
MMSeqs2 GPU 17-b804f
Uniref30_2302
colabfold_envdb_202108
PDB70_220313
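The input constraints in this card (a sequence of at most 4096 characters and one or more valid database names) can be checked on the client side before a request is sent. The sketch below is illustrative only: the helper `build_msa_request`, the payload keys `sequence` and `databases`, the residue alphabet, and the exact database name spellings accepted by the service are all assumptions, not the NIM's documented schema.

```python
# Limit taken from the input format stated in this card.
MAX_SEQUENCE_LENGTH = 4096

# Database names as listed in this card. Whether the service accepts
# exactly these spellings is an assumption.
KNOWN_DATABASES = ("Uniref30_2302", "colabfold_envdb_202108", "PDB70_220313")

# Assumed alphabet: the 20 standard amino acids plus X for unknown residues.
VALID_RESIDUES = frozenset("ACDEFGHIKLMNPQRSTVWYX")


def build_msa_request(sequence, databases):
    """Validate inputs and return a request payload (hypothetical schema)."""
    # Remove whitespace and line breaks, then normalize to uppercase.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    if len(seq) > MAX_SEQUENCE_LENGTH:
        raise ValueError(
            f"sequence has {len(seq)} characters; limit is {MAX_SEQUENCE_LENGTH}"
        )
    invalid = sorted(set(seq) - VALID_RESIDUES)
    if invalid:
        raise ValueError(f"unexpected residue characters: {invalid}")
    if not databases:
        raise ValueError("at least one database name is required")
    unknown = [name for name in databases if name not in KNOWN_DATABASES]
    if unknown:
        raise ValueError(f"unknown database names: {unknown}")
    return {"sequence": seq, "databases": list(databases)}
```

Rejecting malformed input locally avoids a round trip to the service for requests that would fail anyway.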
Not Applicable.
Link: Not Applicable.
Data Collection Method by dataset
Labeling Method by dataset
Properties: Not Applicable.
Link: Not Applicable.
Data Collection Method by dataset
Labeling Method by dataset
Properties:
Not Applicable
Engine: Python, C++, CUDA
Test Hardware:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.