Follow the steps below to download and run the NVIDIA NIM inference microservice for this model on your infrastructure of choice.
First, log in to the NVIDIA container registry (nvcr.io), using `$oauthtoken` as the username and your NGC API key as the password:

```bash
$ docker login nvcr.io
Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>
```
Pull and run the NVIDIA NIM with the command below. This will download the optimized model for your infrastructure.
```bash
export NGC_API_KEY=<PASTE_API_KEY_HERE>
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"
docker run -it --rm \
  --gpus all \
  -e NGC_API_KEY=$NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/nvclip:2.0.0
```
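The first launch can take some time while the model is downloaded into the local cache. Before sending requests, you can poll the service until it reports ready. The sketch below is a minimal example, assuming the NIM exposes the standard `/v1/health/ready` readiness endpoint on the port mapped above:

```python
import time

import requests

NIM_URL = "http://0.0.0.0:8000"  # host/port from the docker run command above

# Poll the readiness endpoint (assumed: /v1/health/ready) until the model
# is loaded and the service can accept inference requests.
for _ in range(60):
    try:
        if requests.get(f"{NIM_URL}/v1/health/ready", timeout=5).status_code == 200:
            print("NIM is ready")
            break
    except requests.ConnectionError:
        pass  # container is still starting up
    time.sleep(5)
else:
    raise RuntimeError("NIM did not become ready in time")
```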
Once the service is ready, you can make a local API call using this curl command:
```bash
curl -X POST 'http://0.0.0.0:8000/v1/embeddings' \
  -H "Content-Type: application/json" \
  -d '{
    "input": [
      "The quick brown fox jumped over the lazy dog",
      "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAAEElEQVR4nGK6HcwNCAAA//8DTgE8HuxwEQAAAABJRU5ErkJggg=="
    ],
    "model": "nvidia/nvclip-vit-h-14",
    "encoding_format": "float"
  }'
```
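If you would rather call the service from code, here is a minimal Python sketch using the `requests` library. It base64-encodes a local image into a data URI matching the request shape above (the `example.png` path is hypothetical), and it assumes the response follows the OpenAI-style embeddings schema, i.e. a `data` list with `index` and `embedding` fields:

```python
import base64

import requests

NIM_URL = "http://0.0.0.0:8000"  # local NIM endpoint from the steps above

# Hypothetical local image; any PNG on disk will do.
with open("example.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    # NV-CLIP accepts a mix of text strings and base64 image data URIs
    # in a single request, as in the curl example above.
    "input": [
        "The quick brown fox jumped over the lazy dog",
        f"data:image/png;base64,{image_b64}",
    ],
    "model": "nvidia/nvclip-vit-h-14",
    "encoding_format": "float",
}

response = requests.post(f"{NIM_URL}/v1/embeddings", json=payload, timeout=60)
response.raise_for_status()

# One embedding vector is returned per input item.
for item in response.json()["data"]:
    print(f"input {item['index']}: {len(item['embedding'])}-dim embedding")
```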
For more details on getting started with this NIM, visit the NVIDIA NIM Docs.