Copyright © 2026 NVIDIA Corporation

nvidia

nemotron-graphic-elements-v1

Downloadable

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Chart Detection, Object Detection, Table Detection, data ingestion, nemo retriever
Accelerated by DGX Cloud

nemotron-graphic-elements-v1

Description

nemotron-graphic-elements-v1 is a specialized object detection model designed to identify and extract key elements from charts and graphs. Based on YOLOX (an anchor-free version of YOLO), it detects and localizes graphic elements including titles, axis labels, legends, and data point annotations. While the underlying approach builds upon work from the YOLOX ecosystem, NVIDIA developed its own base model through complete retraining rather than using pre-trained weights.

This model supersedes the CACHED model.

This model is ready for commercial/non-commercial use.

License and Terms of Use:

GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Open Model License Agreement.

You are responsible for ensuring that your use of NVIDIA provided models complies with all applicable laws.

Model Developer: NVIDIA

Deployment Geography:

Global

Use Case:

This model is designed for automating extraction of graphic elements in enterprise documents, including:

  • Enterprise document extraction, embedding, and indexing
  • Augmenting Retrieval Augmented Generation (RAG) workflows with multimodal retrieval
  • Data extraction from legacy documents and reports

Release Date:

Build.NVIDIA.com: 03/02/2026, as nemotron-graphic-elements-v1

References:

  • YOLOX Repository
  • YOLOX Paper

Model Architecture:

Architecture Type: YOLOX
Network Architecture: DarkNet53 Backbone + FPN decoupled head (one 1x1 convolution + 2 parallel 3x3 convolutions: one for classification and one for bounding box prediction)
Classes Detected: chart_title, x_title, y_title, xlabel, ylabel, other, legend_label, legend_title, mark_label, value_label
Number of Model Parameters: 5.4e7 (~54 million)
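As context for the anchor-free design noted above, the sketch below shows how raw outputs from a YOLOX-style decoupled head are typically decoded into absolute pixel boxes (center offsets added to the grid cell, widths/heights exponentiated and scaled by the stride). This is an illustrative sketch of the general YOLOX decoding scheme, not this model's exact implementation; the tensor layout assumed here ([H, W, 4 + 1 + num_classes]) is a common convention, not documented in this card.

```python
import numpy as np

def decode_yolox_outputs(preds, strides=(8, 16, 32)):
    """Decode raw anchor-free YOLOX-style head outputs into absolute boxes.

    preds: list of arrays, one per stride, each of shape
    [H, W, 4 + 1 + num_classes] holding (tx, ty, tw, th, obj, cls...).
    Returns (boxes [N, 4] as x_min/y_min/x_max/y_max, scores [N, num_classes]).
    """
    boxes, scores = [], []
    for p, stride in zip(preds, strides):
        h, w = p.shape[:2]
        ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
        cx = (p[..., 0] + xs) * stride   # center x in pixels
        cy = (p[..., 1] + ys) * stride   # center y in pixels
        bw = np.exp(p[..., 2]) * stride  # box width
        bh = np.exp(p[..., 3]) * stride  # box height
        box = np.stack([cx - bw / 2, cy - bh / 2,
                        cx + bw / 2, cy + bh / 2], -1)
        boxes.append(box.reshape(-1, 4))
        # Class score = objectness * per-class probability.
        scores.append((p[..., 4:5] * p[..., 5:]).reshape(-1, p.shape[-1] - 5))
    return np.concatenate(boxes), np.concatenate(scores)
```

A zero-initialized prediction at stride 8 decodes to an 8×8 box centered on each grid cell, which is a quick sanity check for the decoding logic.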

Input:

Input Types: Image
Input Formats: RGB
Input Parameters: Two Dimensional (2D)
Other Input Properties: Accepts a single image as an np.ndarray of shape [Channel, Width, Height], or a batch of images as an np.ndarray of shape [Batch, Channel, Width, Height].
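Most image libraries yield arrays in [Height, Width, Channel] (HWC) layout, so a transpose is needed to match the [Batch, Channel, Width, Height] layout stated above. A minimal sketch, assuming uint8 HWC source images of equal size; any resizing or normalization the deployed pipeline performs is model-specific and omitted here:

```python
import numpy as np

def to_model_input(images):
    """Convert HWC uint8 image(s) to a float32 [Batch, Channel, Width, Height]
    array. `images` may be a single HWC array or a list of same-sized HWC arrays."""
    if isinstance(images, np.ndarray) and images.ndim == 3:
        images = [images]
    return np.stack([
        np.transpose(img.astype(np.float32), (2, 1, 0))  # HWC -> CWH
        for img in images
    ])
```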

Output:

Output Types: Structured detections (bounding boxes + labels + confidence)
Output Format: Dict / JSON-compatible structure
Output Parameters: One Dimensional (1D)
Other Output Properties: Outputs detections per input image.
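The card does not document the exact output schema, so the sketch below shows one plausible way to assemble per-image raw detections (boxes, scores, class indices) into a JSON-compatible dict, using the class list from the Model Architecture section. The field names (`label`, `confidence`, `bbox`, `detections`) and the threshold default are illustrative assumptions, not the model's actual schema.

```python
# Class list from the Model Architecture section of this card.
CLASSES = ["chart_title", "x_title", "y_title", "xlabel", "ylabel",
           "other", "legend_label", "legend_title", "mark_label", "value_label"]

def detections_to_dict(boxes, scores, class_ids, threshold=0.3):
    """Group raw detections for one image into a JSON-compatible structure.

    boxes: iterable of [x_min, y_min, x_max, y_max]; scores: confidences;
    class_ids: integer indices into CLASSES. Field names are hypothetical.
    """
    out = []
    for box, score, cid in zip(boxes, scores, class_ids):
        if score < threshold:  # drop low-confidence detections
            continue
        out.append({
            "label": CLASSES[cid],
            "confidence": float(score),
            "bbox": [float(v) for v in box],
        })
    return {"detections": out}
```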

Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

Software Integration:

Runtime Engines: TensorRT
Supported Hardware Microarchitecture Compatibility:
NVIDIA Ampere
NVIDIA Hopper
NVIDIA Lovelace
Operating Systems: Linux
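Since TensorRT is the stated runtime engine, a serialized engine can typically be built from an ONNX export with `trtexec`. The file names below are hypothetical (this card does not document the export artifact); `--fp16` enables half-precision kernels on supported GPUs:

```shell
# Build a TensorRT engine from a (hypothetical) ONNX export of the model.
trtexec --onnx=nemotron-graphic-elements-v1.onnx \
        --saveEngine=nemotron-graphic-elements-v1.plan \
        --fp16
```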

Inference

Acceleration Engine: TensorRT
Test Hardware: Hardware listed under Supported Hardware Microarchitecture Compatibility above

Model Version(s)

nemotron-graphic-elements-v1

Short Name: nemotron-graphic-elements-v1

Training and Evaluation Datasets:

Training Dataset

Data Modality: Image
Training Data Collection: Hybrid (Automated + Human)
Training Labeling: Hybrid (Automated + Human)
Training Properties: Trained using a mixture of real-world chart images and pseudo-labeled charts, including:

  • PubMed Central (PMC) Chart Dataset (link): A real-world dataset collected from PubMed Central documents and manually annotated (ICPR 2022 CHART-Infographic competition). Includes 5,614 images for chart element detection, 4,293 images for plot detection/data extraction, and 22,924 images for chart classification.
  • DeepRule dataset (link): A large dataset of chart images crawled from public Excel sheets with text overwritten to protect privacy. Relevant classes were pseudo-labeled using the CACHED model; training used a subsample of 9,091 charts where a title was detected, alongside the PMC training images.

Evaluation Dataset

Evaluation Data Collection: Hybrid (Automated + Human)
Evaluation Labeling: Hybrid (Automated + Human)
Evaluation Properties: Evaluated on the PMC Chart dataset, which also serves as the validation set. Mean Average Precision (mAP) was used as the evaluation metric.

Number of bounding boxes and images per class:

Label          Images   Boxes
chart_title        38      38
legend_label      318    1077
legend_title       17      19
mark_label         42     219
other             113     464
value_label        52     726
x_title           404     437
xlabel            553    4091
y_title           502     505
ylabel            534    3944
Total             560   11520

Per-class Performance Metrics

Average Precision (AP)

Class           AP
chart_title     82.38
x_title         88.77
y_title         89.48
xlabel          85.04
ylabel          86.22
other           55.14
legend_label    84.09
legend_title    60.61
mark_label      49.31
value_label     62.66
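The card does not state an overall mAP figure, but the unweighted macro average of the per-class AP values above can be computed directly. This is our derived summary of the tabled numbers, not a reported result:

```python
# Per-class AP values as listed in the table above.
AP = {
    "chart_title": 82.38, "x_title": 88.77, "y_title": 89.48,
    "xlabel": 85.04, "ylabel": 86.22, "other": 55.14,
    "legend_label": 84.09, "legend_title": 60.61,
    "mark_label": 49.31, "value_label": 62.66,
}

# Unweighted (macro) average over the 10 classes.
mean_ap = sum(AP.values()) / len(AP)
print(f"macro-averaged AP over 10 classes: {mean_ap:.2f}")  # prints 74.37
```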

Average Recall (AR)

Class           AR
chart_title     93.16
x_title         92.31
y_title         92.32
xlabel          88.93
ylabel          89.40
other           79.48
legend_label    88.07
legend_title    68.42
mark_label      73.61
value_label     68.32

The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case, and address unforeseen product misuse.

For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns here.