NVIDIA
Copyright © 2026 NVIDIA Corporation

5 results for

Filters (1): NVIDIA (5)
  • Serve Qwen3-235B with vLLM
    Set up a vLLM server with Qwen3-235B on DGX Station
    Playbook · vLLM · Inference · DGX Station · 20 min · 1mo
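The vLLM setup this playbook describes can be sketched as a single launch command. The model ID, parallelism, and port below are assumptions based on the card, not the playbook's literal steps:

```shell
# Sketch only: serve Qwen3-235B behind vLLM's OpenAI-compatible API.
# Model ID and flags are assumptions; consult the playbook for exact steps.
pip install vllm

# Qwen3-235B is a large MoE model; tensor-parallel size depends on
# how many GPUs your DGX Station exposes.
vllm serve Qwen/Qwen3-235B-A22B \
    --tensor-parallel-size 4 \
    --host 0.0.0.0 \
    --port 8000
```

Once running, any OpenAI-compatible client can target `http://localhost:8000/v1`.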
  • LM Studio on DGX Spark
    Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.
    Playbook · Inference · DGX Spark · 30 min · 2mo
  • Nemotron-3-Nano with llama.cpp
    Run the Nemotron-3-Nano-30B model using llama.cpp on DGX Spark
    Playbook · Nemotron · DGX Spark · 30 min · 4mo
  • Run models with llama.cpp on DGX Spark
    Build llama.cpp with CUDA and serve models via an OpenAI-compatible API (Nemotron 3 Nano Omni as example)
    Playbook · DGX Spark · 30 min · 3w
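The build-and-serve flow this card names can be sketched as follows; the exact CMake flags and model path are assumptions, not the playbook's literal commands:

```shell
# Sketch: build llama.cpp with CUDA and serve a model over an
# OpenAI-compatible API. Paths and flags are assumptions.
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# Build with CUDA support enabled.
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Serve a GGUF model (e.g. a Nemotron 3 Nano quantization) on port 8080.
./build/bin/llama-server -m /path/to/model.gguf --host 0.0.0.0 --port 8080
```

`llama-server` exposes `/v1/chat/completions`, so standard OpenAI-compatible clients can query it without modification.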
  • cuTile Kernels
    Run cuTile kernel benchmarks, an FMHA implementation, and LLM inference on DGX Spark and B300
    Playbook · FMHA · DGX Spark · 60 min · 1d