17 results for
Google
paligemma
Vision language model adept at comprehending text and visual inputs to produce informative responses
Model
image
+8
333K
1y
NVIDIA
retail-object-detection
EfficientDet-based object detection network that detects 100 specific retail objects in input video.
Model
Object Detection
+7
811
1y
NVIDIA
visual-changenet
Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask.
Model
image
+8
624
1y
NVIDIA
ocdrnet
OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.
Model
Optical Character Recognition
+7
757
1y
NVIDIA
nv-dinov2
NV-DINOv2 is a visual foundation model that generates vector embeddings for the input image.
Model
Image-to-Embedding
+4
1.13M
11mo
NVIDIA
nv-embed-v1
Generates high-quality numerical embeddings from text inputs.
Model
Non-Commercial Use Only
+2
1.48M
7mo
NVIDIA
nv-embedcode-7b-v1
The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.
Model
nemo retriever
+2
255K
9mo
NVIDIA
nv-embedqa-e5-v5
English text embedding model for question-answering retrieval.
Model
Embedding
+4
3.21M
7mo
NVIDIA
nv-grounding-dino
Grounding DINO is an open-vocabulary, zero-shot object detection model.
Model
Object Detection
+3
3.59K
11mo
NVIDIA
parakeet-ctc-0.6b-zh-cn
Record-setting accuracy and performance for Mandarin-English transcription.
Model
ASR
+4
7.79K
6mo
NVIDIA
nv-yolox-page-elements-v1
Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
Model
Object Detection
+6
15.91K
8mo
NVIDIA
llama-3.2-nv-embedqa-1b-v2
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
Model
nemo retriever
+3
6.82M
7mo
NVIDIA
llama-3.2-nv-rerankqa-1b-v2
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
Model
nemo retriever
+2
165K
7mo
NVIDIA
sparsedrive
End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.
Model
autonomous vehicles
+3
109
7mo
NVIDIA
streampetr
StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.
Model
autonomous vehicles
+3
296K
3mo
NVIDIA
maisi
MAISI is a pre-trained volumetric (3D) CT Latent Diffusion Generative Model.
Model
Image Generation
+2
755
11mo
NVIDIA
nvclip
NV-CLIP is a multimodal embeddings model for image and text.
Model
Computer vision
+3
19.66K
9mo