NVIDIA
Copyright © 2026 NVIDIA Corporation

5 results for

  • OpenAI
    Downloadable

    whisper-large-v3

    Robust Speech Recognition via Large-Scale Weak Supervision.
    Model
    ASR
    144K
    1y
  • OpenAI
    Downloadable · Free Endpoint

    gpt-oss-120b

    Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.
    Model
    reasoning
    46.61M
    10mo
  • OpenAI
    Downloadable · Free Endpoint

    gpt-oss-20b

    Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math.
    Model
    reasoning
    18.74M
    10mo
  • DGX Spark
    30 MIN

    Run models with llama.cpp on DGX Spark

    Build llama.cpp with CUDA and serve models via an OpenAI-compatible API (Nemotron 3 Nano Omni as example)
    Playbook
    DGX Spark
    2mo

    Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server,
    Skill
    Developer
    141
    4d