Multi-modal model to classify safety for input prompts as well as output responses.
Multimodal question-answer retrieval representing user queries as text and documents as images.
Improve safety, security, and privacy of AI systems at build, deploy and run stages.
State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.
State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents
State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.
Advanced reasoning MoE model excelling at reasoning, multilingual tasks, and instruction following
FLUX.1-schnell is a distilled image generation model, producing high-quality images at fast speeds
State-of-the-art, multilingual model tailored to all 24 official European Union languages.
Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses
High accuracy and optimized performance for transcription in 25 languages
FLUX.1 is a state-of-the-art suite of image generation models
Generalist model to generate future world state as videos from text and image prompts to create synthetic training data for robots and autonomous vehicles.
Simulate, test, and optimize physical AI and robotic fleets at scale in industrial digital twins before real-world deployment.
Route LLM requests to the best model for the task at hand.
Robust Speech Recognition via Large-Scale Weak Supervision.
Multi-lingual model supporting speech-to-text recognition and translation.
Multi-lingual model supporting speech-to-text recognition and translation.
State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.
Continuously extract, embed, and index multimodal data for fast, accurate semantic search. Built on world-class NeMo Retriever models, the RAG blueprint connects AI applications to multimodal enterprise data wherever it resides.
Context-aware chart extraction that can detect 18 classes of basic chart elements, excluding plot elements.
Automatic speech recognition model that transcribes speech in lowercase English with record-setting accuracy and performance
State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Record-setting accuracy and performance for English transcription.
State-of-the-art accuracy and speed for English transcriptions.
Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.
Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.
A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG.
Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.
A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation