Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.
By running the commands below, you accept the NVIDIA AI Enterprise Terms of Use and the NVIDIA Community Models License.
Pull and run nvidia/magpie-tts-multilingual using Docker (this will download the full model and run it in your local environment).
$ docker login nvcr.io
Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>
Refer to Supported Models for the full list of models.
export NGC_API_KEY=<PASTE_API_KEY_HERE>
export CONTAINER_ID=magpie-tts-multilingual
docker run -it --rm --name=$CONTAINER_ID \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --shm-size=8GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -e NIM_HTTP_API_PORT=9000 \
  -e NIM_GRPC_API_PORT=50051 \
  -p 9000:9000 \
  -p 50051:50051 \
  nvcr.io/nim/nvidia/$CONTAINER_ID:latest
From the time the container is started, it may take up to 30 minutes, depending on your network speed, for the service to become ready and start accepting requests.
Open a new terminal and run the following command to check whether the service is ready to handle inference requests.
curl -X 'GET' 'http://localhost:9000/v1/health/ready'
If the service is ready, you get a response similar to the following.
{"ready":true}
Install the Riva Python client package
sudo apt-get install python3-pip
pip install nvidia-riva-client
Download Riva sample clients
git clone https://github.com/nvidia-riva/python-clients.git
Run Text to Speech inference
python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 --text "Hello, this is a speech synthesizer." --language-code en-US --output output.wav
After the command completes, the synthesized audio is saved to a file named output.wav.
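As an alternative to the talk.py script, you can call the service directly from the nvidia-riva-client Python API. The sketch below is a minimal example, not the only way to do it: it assumes the gRPC port is mapped to 50051 as above, that the server's default voice is acceptable (pass voice_name to synthesize to pick a specific voice), and that the returned audio is 16-bit mono PCM at the requested sample rate.

import wave

import riva.client

# Connect to the gRPC endpoint exposed by the running container.
auth = riva.client.Auth(uri="localhost:50051")
tts = riva.client.SpeechSynthesisService(auth)

# Synthesize speech; voice_name is omitted here, which assumes the server default voice.
resp = tts.synthesize(
    text="Hello, this is a speech synthesizer.",
    language_code="en-US",
    sample_rate_hz=44100,
)

# resp.audio holds the raw PCM samples; wrap them in a WAV container.
with wave.open("output.wav", "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)       # 16-bit samples
    out.setframerate(44100)
    out.writeframes(resp.audio)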
For more details on getting started with this NIM, visit the Riva TTS NIM Docs.