---
title: "deepvariant"
publisher: "nvidia"
type: "endpoint"
updated: "2025-07-20T16:34:54.123Z"
description: "Run Google's DeepVariant optimized for GPU. Switch models for high accuracy on all major sequencers."
canonical: "https://build.nvidia.com/nvidia/parabricks-deepvariant"
---

# Model Overview

## Description

DeepVariant (the Parabricks tool behind the Universal Variant Calling Microservice) is a deep learning model that can help identify variants in short- and long-read sequencing datasets.  
This model is ready for commercial use.

DeepVariant takes aligned sequencing reads in BAM/CRAM format and uses a convolutional neural network (CNN) to classify each candidate locus as either true underlying genomic variation or sequencing error. DeepVariant can therefore call single nucleotide variants (SNVs) and insertions/deletions (InDels) from sequencing data at high accuracy in germline samples.
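
As a rough illustration of the classification step, the sketch below turns three per-site scores into a genotype call. It assumes the three-class genotype output (homozygous reference, heterozygous, homozygous alternate) used by community DeepVariant; the function name `call_genotype` and the example scores are hypothetical and are not part of any Parabricks API.

```python
import math

# Genotype classes, in the order assumed for this sketch.
GENOTYPES = ("0/0", "0/1", "1/1")  # hom-ref, het, hom-alt


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def call_genotype(logits):
    """Return (genotype, probability) for the most likely class.

    A "0/0" call means the candidate is treated as matching the
    reference (for example a sequencing error), not a real variant.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return GENOTYPES[best], probs[best]


genotype, prob = call_genotype([0.2, 3.1, 0.5])  # hypothetical scores
print(genotype, round(prob, 3))
```

In this toy setup the heterozygous class has the highest score, so the candidate would be reported as a variant rather than discarded as an error.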

Parabricks DeepVariant is a GPU-optimized implementation of the DeepVariant pipeline that substantially reduces variant calling runtime.

This model supports read sets from Illumina, Oxford Nanopore, and Pacific Biosciences natively; supports both whole-genome and whole-exome sequencing; and can output either Variant Call Format (VCF) or genomic VCF.

The Universal Variant Calling NIM can:
- Process short-read whole exome data
- Process short-read and long-read whole genome data
- Perform inference locally or on NVIDIA GPU Cloud
- Output VCF or gVCF

## Third-Party Community Consideration 

This model is not owned or developed by NVIDIA. It was developed and built to a third party’s requirements for this application and use case; see the [DeepVariant license on GitHub](https://github.com/google/deepvariant?tab=readme-ov-file#license).

## Reference(s)

[Parabricks Latest Documentation](https://docs.nvidia.com/clara/parabricks/latest/index.html)

## Terms of use

By using this software or model, you agree to the NVIDIA Parabricks [Terms of Use](https://docs.nvidia.com/clara/parabricks/latest/documentation/eula.html).

## Model Architecture

**Architecture Type:** Convolutional Neural Network (CNN) <br>

**Network Architecture:** Inceptionv2 <br>

For more information, see [the Parabricks documentation](https://docs.nvidia.com/clara/parabricks/4.2.1/documentation/tooldocs/man_deepvariant.html#what-is-deepvariant).

## Input

**Input Type(s):** Indices (Text, Binary) <br>
**Input Format(s):** Tarball <br>
**Input Parameters:** One Dimensional (1D) <br>

- A reference genome tarball that contains a reference genome and the indices generated by `samtools` and `bwa`. This can be generated by running:

```bash
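# Build the FASTA index (.fai)
samtools faidx <reference genome>
# Build the BWA index files
bwa index <reference genome>
# Bundle the reference and its index files into a single tarball
tar cvf <reference genome>.tar <reference genome>*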
```

- A Binary Alignment Map (BAM) file produced by Parabricks fq2bam or the Burrows-Wheeler Aligner (BWA).
- A BAM Index (BAI) file.
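
Before submitting a job, it can help to confirm that the reference tarball actually contains the index files. The helper below is a sketch under stated assumptions, not part of Parabricks: the function name `missing_index_files` is hypothetical, and the suffix list reflects the files that `samtools faidx` (`.fai`) and `bwa index` (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`) normally write next to the FASTA.

```python
import io
import tarfile

# Suffixes written by `samtools faidx` and `bwa index` next to the FASTA.
INDEX_SUFFIXES = (".fai", ".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_index_files(tar, fasta_name):
    """Return the expected member names that are absent from the tarball."""
    present = set(tar.getnames())
    expected = [fasta_name] + [fasta_name + s for s in INDEX_SUFFIXES]
    return [name for name in expected if name not in present]


# Demo on an in-memory tarball so the example runs without real data.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    for name in ("ref.fa", "ref.fa.fai", "ref.fa.amb"):
        info = tarfile.TarInfo(name)
        info.size = 0  # empty placeholder members
        tar.addfile(info, io.BytesIO(b""))

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    missing = missing_index_files(tar, "ref.fa")
print(missing)
```

For a real archive, open it with `tarfile.open("<reference genome>.tar")` instead of the in-memory buffer. The check compares member names exactly, so a tarball created with a directory prefix would need that prefix included in `fasta_name`.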

## Output

**Output Type(s):** Text (Sample, Manifest, Path, Path) <br>
**Output Format:** VCF File <br>
**Output Parameters:** 1D <br>

The output of the DeepVariant Microservice is the following:

- A VCF file containing variant calls for your sample.
- A VCF manifest (which contains the needed parts to sign a multipart-upload request if running in the cloud).
- A path to the STDOUT of the run (either locally or in cloud storage).
- A path to the STDERR of the run (either locally or in cloud storage).
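
To give a sense of what can be done with the returned VCF, the sketch below counts SNVs and indels in a handful of VCF lines. The records are made up for illustration and are not real DeepVariant output; `classify_variants` is a hypothetical helper that relies only on the standard tab-separated VCF column order (`REF` in column 4, `ALT` in column 5).

```python
def classify_variants(vcf_lines):
    """Count SNVs and indels among the ALT alleles of VCF data lines."""
    counts = {"snv": 0, "indel": 0, "other": 0}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4].split(",")
        for alt in alts:
            if alt in (".", "*") or alt.startswith("<"):
                counts["other"] += 1  # missing, spanning deletion, symbolic
            elif len(ref) == 1 and len(alt) == 1:
                counts["snv"] += 1
            elif len(ref) != len(alt):
                counts["indel"] += 1
            else:
                counts["other"] += 1  # e.g. multi-nucleotide substitution
    return counts


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.\tGT\t0/1",
    "chr1\t2000\t.\tAT\tA\t42\tPASS\t.\tGT\t1/1",
    "chr1\t3000\t.\tC\tCGG,T\t37\tPASS\t.\tGT\t1/2",
]
counts = classify_variants(example)
print(counts)
```

The sketch counts every ALT allele regardless of the `FILTER` column; a real analysis may want to keep only records marked `PASS`, and a gVCF would additionally contain reference blocks with symbolic ALT alleles, which land in the `other` bucket here.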

## Software Integration

**Supported Hardware Platform(s):**
NVIDIA GPU(s) with at least 24 GB of GPU memory (VRAM), including Hopper, Lovelace, Ampere, Turing, and Volta generations. <br>

**Supported Operating System(s):**
Linux <br>

## Model Version

- V4.2.1-1 <br>

# Inference

**Engine:** [Triton and PyTriton](https://developer.nvidia.com/triton-inference-server) <br>
**Test Hardware:**  Other <br>

## Bias

Field                                                                                               |  Response
:---------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
Participation considerations from adversely impacted groups ([protected classes](https://www.senate.ca.gov/content/protected-classes)) in model design and testing:  |  None of the Above. Please see the [National Human Genome Research Institute's Guide on Human Genomic Variation](https://www.genome.gov/about-genomics/educational-resources/fact-sheets/human-genomic-variation).
Measures taken to mitigate against unwanted bias:                                                   |    None of the Above.

## Explainability

Field                                                                                                  |  Response
:------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------
Intended Applications & Domains:                                                                       |  Genomics
Types:                                                                                                 |  Convolutional Neural Network
Intended Users:                                                                                        |  Genomics Researchers and Population Genetics Scientists
Output:                                                                                                |  Variant calls in the [Variant Call Format](https://samtools.github.io/hts-specs/VCFv4.3.pdf)
Describe how the model works:                                                                          |  Reads are transformed into an image pileup structure. Variants are called based on the information in various channels in the image structure.
Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of:  |  Not Applicable
Technical Limitations:                                                                                 |  Requires 1 or more GPUs with 24 GB of VRAM
Verified to have met prescribed NVIDIA quality standards:                                                     |  Yes
Performance Metrics:                                                                                   |  Functional Equivalence to community DeepVariant; Runtime.
Potential Known Risks:                                                                                 |  None Known
Licensing:                                                                                             |  https://docs.nvidia.com/clara/parabricks/latest/documentation/eula.html
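
The "image pileup structure" mentioned in the table can be pictured with a toy encoder. The sketch below is a deliberate simplification and not the actual DeepVariant encoding: it handles only gap-free reads, builds just two channels (base identity and mismatch against the reference), and uses arbitrary base intensities. The real pipeline encodes more information per read, such as base and mapping quality. The name `encode_pileup` and all inputs are hypothetical.

```python
# Toy intensity per base; the values are arbitrary for this sketch.
BASE_CODE = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}


def encode_pileup(reference, reads, start, width):
    """Build two channels (rows = reads, columns = reference positions).

    reads: list of (alignment_start, sequence) pairs, gap-free only.
    Returns (base_channel, mismatch_channel) as lists of lists of floats.
    """
    base_channel, mismatch_channel = [], []
    for read_start, seq in reads:
        base_row = [0.0] * width
        mismatch_row = [0.0] * width
        for offset, base in enumerate(seq):
            col = read_start + offset - start
            if 0 <= col < width:
                base_row[col] = BASE_CODE.get(base, 0.0)
                if base != reference[read_start + offset]:
                    mismatch_row[col] = 1.0  # read disagrees with reference
        base_channel.append(base_row)
        mismatch_channel.append(mismatch_row)
    return base_channel, mismatch_channel


reference = "ACGTACGTAC"
reads = [(0, "ACGTA"), (2, "GTTCG"), (4, "ACGTAC")]  # hypothetical reads
bases, mismatches = encode_pileup(reference, reads, start=2, width=6)
for row in mismatches:
    print(row)
```

Each row corresponds to one read and each column to one reference position in the window, so a column where several reads carry a mismatch is the kind of pattern a CNN can learn to separate from isolated sequencing errors.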

## Privacy

Field                                                                                                                              |  Response
:----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------
Generatable or reverse engineerable personally-identifiable information (PII)?                                                     |  Reverse Engineerable PII
Was consent obtained for any PII used?                                                                                             |  Yes; consent was obtained by Google.
Protected class data used to create this model?                                                                                    |  None
How often is dataset reviewed?                                                                                                     |  Before Release
Is a mechanism in place to honor data subject right of access or deletion of personal data?                                        |  Unknown
If PII collected for the development of the model, was it collected directly by NVIDIA?                                            |  No
If PII collected for the development of the model by NVIDIA, do you maintain or have access to disclosures made to data subjects?  |  N/A
If PII collected for the development of this AI model, was it minimized to only what was required?                                 |  N/A
Is there provenance for all datasets used in training?                                                                                                                          |  Yes
Does data labeling (annotation, metadata) comply with privacy laws?                                                                |  Yes, where relevant
Is data compliant with data subject requests for data correction or removal, if such a request was made?                           |  No, not possible with externally-sourced data.

## Safety & Security

Field                                               |  Response
:---------------------------------------------------|:----------------------------------
Model Application(s):                               | Variant Calling of Aligned Sequencing Read Data
Describe the life-critical impacts (if present).   |  Should not be used for life-critical use cases per [The Parabricks End User License Agreement](https://docs.nvidia.com/clara/parabricks/latest/documentation/eula.html)
Use Case Restriction(s):                              |  Refer to [The Parabricks End User License Agreement](https://docs.nvidia.com/clara/parabricks/latest/documentation/eula.html)
Model and Dataset Restriction(s):                       |  The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development.