End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.
The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
Object detection model fine-tuned to detect charts, tables, and titles in documents.
Rapidly identify and mitigate container security vulnerabilities with generative AI.
Grounding DINO is an open-vocabulary, zero-shot object detection model.
Vision foundation model capable of performing diverse computer vision and vision-language tasks.
Most advanced language model for reasoning, code, and multilingual tasks; runs on a single GPU.
Multilingual text reranking model.
English text embedding model for question-answering retrieval.
Multilingual text question-answering retrieval, transforming textual information into dense vector representations.
Generates high-quality numerical embeddings from text inputs.
Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask.
EfficientDet-based object detection network that detects 100 specific retail objects in an input video.
Cutting-edge open multimodal model excelling in high-quality reasoning from images.