Try NVIDIA NIM APIs

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

optical character recognition nemo retriever data ingestion table extraction supported language - english nvidia

microsoft phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

speech recognition visual qa language generation image-to-text chart and table understanding microsoft

arc evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

dna generation biology nim bionemo drug discovery arc

nvidia nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection against adversarial attempts.

llm security jailbreak detection prompt injection nvidia nim nvidia

university-at-buffalo cached

Context-aware chart extraction model that detects 18 classes of basic chart elements, excluding plot elements.

nemo retriever chart element detection image-to-text university-at-buffalo

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection data ingestion chart detection nemo retriever table detection run-on-rtx extraction nvidia

baidu paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

optical character recognition table extraction optical character detection nemo retriever data ingestion run-on-rtx extraction baidu

hive deepfake-image-detection

Advanced AI model that detects faces and identifies deepfake images.

computer vision ai safety deep fake detection content moderation hive

hive ai-generated-image-detection

Robust image classification model for detecting and managing AI-generated content.

image classification computer vision ai safety content moderation hive

nvidia nv-grounding-dino

Grounding DINO is an open-vocabulary, zero-shot object detection model.

object detection computer vision deepstream nvidia nim nvidia

stabilityai stable-diffusion-3-medium

Advanced text-to-image model for generating high-quality images.

image generation text-to-image stabilityai

nvidia ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.

optical character recognition image optical character detection cv vlm computer vision tao toolkit video nvidia

nvidia retail-object-detection

EfficientDet-based object detection network that detects 100 specific retail objects in an input video.

object detection image cv vlm computer vision tao toolkit video nvidia nim nvidia

stabilityai stable-diffusion-xl

Generate images and stunning visuals with realistic aesthetics.

image generation text-to-image stabilityai

google deplot

Translate images of plots into tables with one-shot visual language understanding.

nemo retriever multimodal data ingestion image-to-text google

stabilityai stable-video-diffusion

Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.

image generation text-to-image stabilityai