
Nemotron OCR v2 is a state-of-the-art multilingual text recognition model designed for robust end-to-end optical character recognition (OCR) on complex real-world images.
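Follow the steps below to download and run the NVIDIA NIM inference microservice for this model on your infrastructure of choice.
First, log in to the NVIDIA container registry (nvcr.io) with your API key. The username is the literal string $oauthtoken; do not substitute a value for it.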
$ docker login nvcr.io
Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>
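For scripted setups, the same login can be done without the interactive prompt. The sketch below is one possible approach, not part of the official instructions: the helper name ngc_login is hypothetical, and it assumes your key is already exported as NGC_API_KEY. It uses docker login's --password-stdin flag so the key is not passed on the command line.

```shell
# Hypothetical helper: log in to nvcr.io without the interactive prompt.
# Assumes NGC_API_KEY is already set in the environment.
ngc_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # The username is the literal string $oauthtoken, so it is single-quoted
  # to stop the shell from expanding it as a variable.
  printf '%s' "$NGC_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Calling ngc_login after exporting the key should have the same effect as typing the credentials at the prompts above.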
Pull and run the NVIDIA NIM with the command below. This will download the optimized model for your infrastructure.
export NGC_API_KEY="<PASTE_API_KEY_HERE>"
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
docker run -it --rm \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY="$NGC_API_KEY" \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u "$(id -u)" \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/nemotron-ocr-v2:latest
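The container may take a while to download the model and start serving, so it can help to wait for the service before sending requests. The sketch below is a generic polling loop meant to be run from a second terminal. The helper name wait_for_ready is hypothetical, and the /v1/health/ready path in the example is an assumption that is not confirmed on this page; check the NIM documentation for the actual health endpoint.

```shell
# Hypothetical helper: poll a URL until it answers without an HTTP error.
# Usage: wait_for_ready URL [MAX_ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  url=$1
  attempts=${2:-60}
  delay=${3:-5}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx; -s silences progress output.
    if curl -sf -o /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts" >&2
  return 1
}

# Example (the health path is an assumption, see the note above):
# wait_for_ready "http://localhost:8000/v1/health/ready"
```

The attempt count and delay are arbitrary defaults; a first run that has to download the model may need a longer limit.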
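Once the service has finished starting, you can make a local API call with the curl command below. Replace <BASE64_ENCODED_IMAGE> with the base64-encoded contents of your image: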
HOSTNAME="localhost"
SERVICE_PORT=8000
cat > payload.json <<EOF
{
  "input": [
    {
      "type": "image_url",
      "url": "data:image/png;base64,<BASE64_ENCODED_IMAGE>"
    }
  ]
}
EOF
curl -X POST \
  "http://${HOSTNAME}:${SERVICE_PORT}/v1/infer" \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  --data-binary @payload.json
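Rather than pasting base64 text into the payload by hand, the request body can be generated from an image file. The following is a minimal sketch under stated assumptions: the helper name make_payload and the file name page.png are hypothetical, and the input is assumed to be a PNG so that it matches the data:image/png prefix used above.

```shell
# Hypothetical helper: print a request body for the image file given as $1.
# Assumes a PNG input, matching the data:image/png prefix in the payload above.
make_payload() {
  image=$1
  if [ ! -r "$image" ]; then
    echo "cannot read $image" >&2
    return 1
  fi
  printf '{"input":[{"type":"image_url","url":"data:image/png;base64,'
  # base64 wraps its output on some platforms, so strip newlines explicitly.
  base64 < "$image" | tr -d '\n'
  printf '"}]}\n'
}

# Example: make_payload page.png > payload.json
```

The resulting payload.json can then be sent with the same curl command shown above.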
For more details on getting started with this NIM, visit the NVIDIA NIM Docs.