Follow the steps below to download and run the NVIDIA NIM inference microservice for this model on your infrastructure of choice.
Set your NGC API key in the `NGC_API_KEY` environment variable:

```shell
export NGC_API_KEY=<PASTE_API_KEY_HERE>
```
```shell
docker run -it \
  --gpus='"device=0"' \
  -p 5000:5000 \
  -e NGC_API_KEY \
  nvcr.io/nvidia/cuopt/cuopt:25.02
```
This command starts the NIM container and exposes port 5000 so you can interact with the NIM.
Wait until the health check returns

```json
{"status":"RUNNING","version":"25.02"}
```

before proceeding. This may take a couple of minutes. You can use the following command to query the health check:

```shell
curl http://localhost:5000/v2/health/ready
```
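If you prefer to wait for readiness from a script rather than polling by hand, a small helper can do it. This is an illustrative sketch, not part of the NIM client: the `wait_until_ready` name and its parameters are our own, and it assumes the default port 5000.

```python
import time

import requests


def wait_until_ready(url, retries=60, interval=5):
    """Poll a NIM health endpoint until it reports ready.

    Returns True once the endpoint answers with HTTP 200,
    False if it never does within the retry budget.
    """
    for _ in range(retries):
        try:
            if requests.get(url, timeout=5).status_code == 200:
                return True
        except requests.RequestException:
            pass  # Container may still be starting; keep polling.
        time.sleep(interval)
    return False


# Example: block until the local NIM is ready.
# wait_until_ready("http://localhost:5000/v2/health/ready")
```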
Create a file named `nim_client.py` with the following contents:

```python
#!/usr/bin/env python3
import time

import requests

# Minimal cuOpt problem: a 2x2 cost matrix, two task locations,
# and two vehicles that each start and end at location 0.
data = {
    "cost_matrix_data": {"data": {"0": [[0, 1], [1, 0]]}},
    "task_data": {"task_locations": [0, 1]},
    "fleet_data": {"vehicle_locations": [[0, 0], [0, 0]]},
}

# Submit the problem; the NIM responds with a request ID to poll.
response = requests.post(url="http://localhost:5000/cuopt/request", json=data)
response_body = response.json()

poll_interval = 2  # Time in seconds between polls
request_url = "http://localhost:5000/cuopt/request/" + response_body["reqId"]

# Poll until the solver response is available.
while True:
    response = requests.get(request_url)
    if response.status_code == 200 and "response" in response.json():
        print(response.json())
        break
    elif response.status_code == 200:
        print("Polling for response")
    else:
        response.raise_for_status()
    time.sleep(poll_interval)
```
Make the script executable and run it:

```shell
chmod +x nim_client.py
./nim_client.py
```
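The request body used above follows a fixed shape: cost matrices keyed by vehicle-type id, the list of task locations, and a `[start, end]` location pair per vehicle. As a sketch, it can also be assembled programmatically; the `build_payload` helper here is our own naming, not part of the NIM API.

```python
def build_payload(cost_matrix, task_locations, vehicle_locations):
    """Assemble a minimal cuOpt NIM request body.

    cost_matrix: square list-of-lists of travel costs between locations.
    task_locations: location index for each task.
    vehicle_locations: [start, end] location pair for each vehicle.
    """
    return {
        # Cost matrices are keyed by vehicle-type id ("0" here).
        "cost_matrix_data": {"data": {"0": cost_matrix}},
        "task_data": {"task_locations": task_locations},
        "fleet_data": {"vehicle_locations": vehicle_locations},
    }


# Reproduces the request body used by nim_client.py:
payload = build_payload([[0, 1], [1, 0]], [0, 1], [[0, 0], [0, 0]])
```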
Alternatively, create a file named `nim_client.sh` with the following contents. Note that the script depends on the `jq` command-line tool, which should be installed through your system package manager (for example, `apt-get install jq`), not pip:

```bash
#!/usr/bin/env bash
set -e

URL=http://localhost:5000/cuopt/request
data='{"cost_matrix_data": {"data": {"0": [[0,1],[1,0]]}}, "task_data": {"task_locations": [0,1]}, "fleet_data": {"vehicle_locations": [[0,0],[0,0]]}}'

# Submit the problem and capture the request ID to poll.
reqId=$(curl -s -H 'Content-Type: application/json' \
  -d "$data" "$URL" | jq -r '.reqId')
echo "$reqId"

# Poll until the solver reports a successful status (0).
while true; do
  response=$(curl -s "$URL/$reqId")
  if echo "$response" | jq -e '.response.solver_response.status == 0' > /dev/null; then
    echo "$response"
    break
  fi
  echo "Polling for response"
  sleep 2
done
```
Make the script executable and run it:

```shell
chmod +x nim_client.sh
./nim_client.sh
```
For more details on getting started with this NIM, visit the NVIDIA cuOpt Docs.