nvidia/studiovoice
Enhance speech by correcting common audio degradations to create studio quality speech output.
Model Overview
Description
Maxine Studio Voice enhances speech recorded through low-quality microphones in noisy or reverberant environments, producing studio-recorded quality output.
Studio Voice is available under NVIDIA Maxine, a developer platform for deploying AI features that enhance audio and video and create new experiences in real-time audio-video communication. Maxine's state-of-the-art models deliver high-quality AI effects using standard microphones and cameras, without additional specialized equipment.
NVIDIA Maxine is available exclusively as part of NVIDIA AI Enterprise for production workflows, an extensive library of full-stack software that includes AI solution workflows, frameworks, pre-trained models, and infrastructure optimization.
Terms of Use
NVIDIA Maxine's Studio Voice is available as a demonstration of the input and output of the Studio Voice generative model. The user may upload an audio file or select one of the sample inputs, then download the generated audio for evaluation under the terms of the NVIDIA MAXINE EVALUATION LICENSE AGREEMENT.
Model Architecture
Architecture Type: Convolutional Neural Networks (CNNs), Transformers,
Generative Adversarial Networks (GANs)
Network Architecture: Encoder-Decoder
Model Version: 0.2
Input:
Input Type(s): Ordered List (audio samples)
Input Format(s): FP32 (-1.0 to 1.0)
Other Properties Related to Input: Pulse Code Modulation (PCM) audio samples
with no encoding or pre-processing; 16 kHz or 48 kHz sampling rate required.
Output:
Output Type(s): Ordered List (audio samples)
Output Format: FP32 (-1.0 to 1.0)
Other Properties Related to Output: PCM audio samples at the input sampling rate
with no encoding or post-processing.
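Because the model expects raw FP32 PCM in [-1.0, 1.0] at 16 kHz or 48 kHz, input audio generally needs to be decoded and resampled before inference. Below is a minimal sketch of that preparation step, assuming the common soundfile and soxr packages (neither is a Maxine requirement):

```python
# Sketch: decode an audio file to mono float32 PCM in [-1.0, 1.0]
# at a sampling rate the model accepts (16 kHz or 48 kHz).
import numpy as np
import soundfile as sf
import soxr

TARGET_RATE = 48000  # or 16000; the model requires one of the two

def load_pcm(path: str) -> np.ndarray:
    samples, rate = sf.read(path, dtype="float32")  # soundfile scales to [-1.0, 1.0]
    if samples.ndim > 1:                            # downmix multi-channel to mono
        samples = samples.mean(axis=1)
    if rate != TARGET_RATE:                         # resample if rates differ
        samples = soxr.resample(samples, rate, TARGET_RATE)
    return np.clip(samples, -1.0, 1.0)
```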
Software Integration
Supported Hardware Platform(s): Hopper, Ada, Ampere, Turing, Volta
Test Hardware: A10, L40, T10
Supported Operating System(s): Linux, Windows
Training & Evaluation
Datasets
NVIDIA models are trained on a diverse set of public and proprietary datasets. The Studio Voice model is trained on a dataset comprising diverse English accents.
Link: DAPS
Properties (Quantity, Dataset Descriptions, Sensor(s)):
The DAPS dataset has 15 versions of audio (3 professional versions and
12 consumer device/real-world environment combinations). Each version consists
of about 4.5 hours of data (about 14 minutes from each of 20 speakers).
Link: LibriTTS
Properties (Quantity, Dataset Descriptions, Sensor(s)):
LibriTTS is a multi-speaker English corpus of approximately 585 hours of read
English speech, resampled to 16 kHz.
Link: VCTK
Properties (Quantity, Dataset Descriptions, Sensor(s)):
The CSTR VCTK Corpus includes speech data uttered by 110 English speakers with
various accents. Each speaker reads about 400 sentences selected from a
newspaper, the Rainbow Passage, and an elicitation paragraph used for the
Speech Accent Archive.
Link: HiFi-TTS
Properties (Quantity, Dataset Descriptions, Sensor(s)):
A multi-speaker English dataset for training text-to-speech models.
The HiFi-TTS dataset contains about 291.6 hours of speech from 10 speakers,
with at least 17 hours per speaker, sampled at 44.1 kHz.
Link: Device Recorded VCTK (DR-VCTK)
Properties (Quantity, Dataset Descriptions, Sensor(s)):
A device-recorded version of the VCTK dataset, captured on common consumer
devices (laptop, tablet, and smartphone) in an office environment. The dataset
contains 109 English speakers with different accents, with around 400 sentences
available from each speaker. Eight different microphones were used for the
recordings, yielding around 250 GB of re-recorded speech.
Link: Dataset of impulse responses from variable acoustics room Arni at Aalto Acoustic Labs
Properties (Quantity, Dataset Descriptions, Sensor(s)):
A dataset of impulse responses collected in the variable acoustics laboratory
Arni at the Acoustics Lab of Aalto University, Espoo, Finland. The dataset
covers 5,342 configurations of sound absorption in Arni, each measured using an
omnidirectional sound source and 5 sound receivers. For each configuration,
5 impulse responses (IRs) were captured, for a total of 132,037 measurements.
Link: Room Impulse Response and Noise Database
Properties (Quantity, Dataset Descriptions, Sensor(s)):
A database of simulated and real room impulse responses, plus isotropic and
point-source noises. All audio files are sampled at 16 kHz with 16-bit
precision.
Link: DNS Challenge 5
Properties (Quantity, Dataset Descriptions, Sensor(s)):
A collated dataset of clean speech, noise, and impulse responses provided by
Microsoft for the ICASSP 2023 Deep Noise Suppression Challenge.
Link: AudioSet
Properties (Quantity, Dataset Descriptions, Sensor(s)):
AudioSet consists of an expanding ontology of 632 audio event classes and
a collection of 2,084,320 human-labeled 10-second sound clips drawn from
YouTube videos.
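Several of the datasets above (Arni, the Room Impulse Response and Noise Database, DNS Challenge 5, AudioSet) supply room impulse responses and noise rather than studio-quality speech. In speech-enhancement training pipelines, such material is typically used to synthesize degraded inputs by convolving clean speech with a room IR and mixing in noise at a target SNR. The sketch below illustrates that general technique; the exact augmentation recipe used for Studio Voice is not published:

```python
# Sketch: synthesize a degraded training input from clean speech, a room
# impulse response, and a noise clip. Illustrative only; not the actual
# Studio Voice training recipe.
import numpy as np
from scipy.signal import fftconvolve

def degrade(clean: np.ndarray, ir: np.ndarray, noise: np.ndarray,
            snr_db: float = 10.0) -> np.ndarray:
    reverberant = fftconvolve(clean, ir)[: len(clean)]  # apply room acoustics
    noise = np.resize(noise, len(clean))                # loop/trim noise to length
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale noise so that 10 * log10(speech_power / noise_power) == snr_db
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return reverberant + gain * noise
```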
Inference
Engine: Triton
Test Hardware: A10, L40, T10
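Since the model is served through Triton, inference can be driven with the standard tritonclient package. A minimal gRPC client sketch follows; the model name ("studiovoice") and tensor names ("INPUT_AUDIO", "OUTPUT_AUDIO") are hypothetical placeholders, so check the deployed model's configuration for the actual values:

```python
# Sketch: send FP32 PCM audio to a Triton-served model over gRPC.
# Model and tensor names below are placeholders, not the real configuration.
import soundfile as sf
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

# Assumes a mono file already at 16 kHz or 48 kHz (see the loading sketch above).
audio, rate = sf.read("noisy_input.wav", dtype="float32")
audio = audio.reshape(1, -1)  # add a batch dimension

infer_input = grpcclient.InferInput("INPUT_AUDIO", list(audio.shape), "FP32")
infer_input.set_data_from_numpy(audio)

result = client.infer(model_name="studiovoice",
                      inputs=[infer_input],
                      outputs=[grpcclient.InferRequestedOutput("OUTPUT_AUDIO")])
enhanced = result.as_numpy("OUTPUT_AUDIO")  # FP32 PCM at the input sampling rate
sf.write("enhanced_output.wav", enhanced.squeeze(), rate)
```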
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.