NVIDIA
Copyright © 2026 NVIDIA Corporation

92 results for

Filters

  • Download Available (60)
  • API Endpoint (32)
  • Launchable (0)
  • Enterprise (0)
  • Retrieval Augmented Generation (13)
  • Object Detection (9)
  • Speech-to-Text (8)
  • Synthetic Data Generation (8)
  • Text-to-Embedding (8)
  • NVIDIA (89)
  • Igenius (1)
  • Mistral AI (1)
  • OpenAI (1)
  • Iguazio (0)
  • NVIDIA AI (0)
  • NVIDIA Omniverse (0)
  • NVIDIA BioNemo (0)
  • NVIDIA Isaac GR00T (0)

    NVIDIA

    nvidia-nemotron-nano-9b-v2

    High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.
    Model
    thinking budget
    704K
    6mo
    NVIDIA

    cuopt

    World-record accuracy and performance for complex route optimization.
    Model
    Route Optimization
    1.43K
    9mo
    NVIDIA

    magpie-tts-flow

    Expressive and engaging text-to-speech, generated from a short audio sample.
    Model
    TTS
    833
    8mo
    NVIDIA

    magpie-tts-multilingual

    Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.
    Model
    TTS
    33.92K
    8mo
    NVIDIA

    magpie-tts-zeroshot

    Expressive and engaging text-to-speech, generated from a short audio sample.
    Model
    TTS
    1.23K
    8mo
    NVIDIA

    parakeet-1.1b-rnnt-multilingual-asr

    High accuracy and optimized performance for transcription in 25 languages.
    Model
    Automatic Speech Recognition
    35.82K
    10mo
    NVIDIA

    eyecontact

    Estimate a person's gaze angles in a video and redirect the gaze to be frontal.
    Model
    telepresence
    1.61K
    11mo
    NVIDIA

    gliner-pii

    GLiNER PII detects Personally Identifiable Information in text.
    Model
    PII Detection
    76.04K
    5d
    NVIDIA

    maisi

    MAISI is a pre-trained volumetric (3D) CT Latent Diffusion Generative Model.
    Model
    Image Generation
    742
    11mo
    NVIDIA

    audio2face-3d

    Converts streamed audio to facial blendshapes for real-time lip-syncing and facial performances.
    Model
    Speech-to-Animation
    8mo
    NVIDIA

    megatron-1b-nmt

    Enable smooth global interactions in 36 languages.
    Model
    Text Translation
    11mo
    NVIDIA

    nv-dinov2

    NV-DINOv2 is a visual foundation model that generates vector embeddings for the input image.
    Model
    Image-to-Embedding
    1.07M
    11mo
    NVIDIA

    nv-grounding-dino

    Grounding DINO is an open-vocabulary, zero-shot object detection model.
    Model
    Object Detection
    3.51K
    11mo
    NVIDIA

    parakeet-ctc-0.6b-es

    Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.
    Model
    ASR
    6mo
    NVIDIA

    parakeet-ctc-0.6b-vi

    Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.
    Model
    ASR
    712
    6mo
    NVIDIA

    parakeet-ctc-0.6b-zh-cn

    Record-setting accuracy and performance for Mandarin-English transcriptions.
    Model
    ASR
    7.71K
    6mo
    NVIDIA

    parakeet-ctc-0.6b-zh-tw

    Record-setting accuracy and performance for Taiwanese Mandarin-English transcriptions.
    Model
    ASR
    383
    4mo
    NVIDIA

    parakeet-ctc-1.1b-asr

    Record-setting accuracy and performance for English transcription.
    Model
    ASR
    34.17K
    8mo
    NVIDIA

    riva-translate-1.6b

    Enable smooth global interactions in 36 languages.
    Model
    Text Translation
    632K
    8mo
    NVIDIA

    riva-translate-4b-instruct-v1_1

    Translation model for 12 languages that supports few-shot example prompts.
    Model
    nvidia nim
    499K
    2mo
    NVIDIA

    canary-1b-asr

    Multilingual model supporting speech-to-text recognition and translation.
    Model
    Automatic Speech Recognition
    3.39K
    11mo
    NVIDIA

    parakeet-tdt-0.6b-v2

    Accurate and optimized English transcriptions with punctuation and word timestamps.
    Model
    ASR
    2.98K
    7mo
    NVIDIA

    bevformer

    Advanced transformer for multi-frame bird's-eye-view 3D perception in autonomous driving.
    Model
    autonomous vehicles
    97
    7mo
    NVIDIA

    llama-3.2-nemoretriever-1b-vlm-embed-v1

    Multimodal question-answer retrieval representing user queries as text and documents as images.
    Model
    nemo retriever
    269K
    8mo