nvidia/fastpitch-hifigan-tts


Expressive and engaging English voices for Q&A assistants, brand ambassadors, and service robots

By running the below commands, you accept the NVIDIA AI Enterprise Terms of Use and the NVIDIA Community Models License.

Pull and run nvidia/fastpitch-hifigan-tts using Docker (this will download the full model and run it in your local environment)

$ docker login nvcr.io
Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>

Pull and run the NVIDIA NIM with the command below.

This command launches the NIM container with the generic (non-optimized) model on any of the supported GPUs. GPU-specific optimized models are available for select GPUs. To use an optimized model, refer to the Supported Models documentation and set NIM_MANIFEST_PROFILE according to your GPU in the Docker run command below.

export NGC_API_KEY=<PASTE_API_KEY_HERE>
export CONTAINER_NAME=fastpitch-hifigan-tts
docker run -it --rm --name=$CONTAINER_NAME \
  --runtime=nvidia \
  --gpus '"device=0"' \
  --shm-size=8GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -e NIM_MANIFEST_PROFILE=3c8ee3ee-477f-11ef-aa12-1b4e6406fad5 \
  -e NIM_HTTP_API_PORT=9000 \
  -e NIM_GRPC_API_PORT=50051 \
  -p 9000:9000 \
  -p 50051:50051 \
  nvcr.io/nim/nvidia/fastpitch-hifigan-tts:1.0.0
Depending on your network speed, it may take up to 30 minutes from the time the container is started for it to become ready and begin accepting requests.

Open a new terminal and run the following command to check whether the service is ready to handle inference requests.

curl -X 'GET' 'http://localhost:9000/v1/health/ready'

If the service is ready, you get a response similar to the following.

{"ready":true}

Install the Riva Python client package

sudo apt-get install python3-pip
pip install -r https://raw.githubusercontent.com/nvidia-riva/python-clients/main/requirements.txt
pip install --force-reinstall git+https://github.com/nvidia-riva/python-clients.git
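Before proceeding, a quick import check can confirm that the client package installed correctly (a minimal sketch; it only verifies that the riva.client module is importable).

# check_install.py - verify the Riva Python client is importable
import riva.client

print("riva.client imported from:", riva.client.__file__)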

Download Riva sample clients

git clone https://github.com/nvidia-riva/python-clients.git

Run Text to Speech inference

python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 --text "Hello, this is a speech synthesizer." --language-code en-US --output output.wav

Running the above command creates the synthesized audio file named output.wav.
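The talk.py script wraps the Riva Python client; the same request can also be made directly in Python. The following is a minimal sketch under a few assumptions: the voice name is left unset so the server's default voice is used, and the 44100 Hz, mono, 16-bit PCM output parameters are assumptions rather than requirements of this NIM.

# synthesize.py - direct use of the Riva Python client (sketch)
import wave

import riva.client

# Connect to the gRPC port exposed by the NIM container.
auth = riva.client.Auth(uri="localhost:50051")
tts = riva.client.SpeechSynthesisService(auth)

resp = tts.synthesize(
    text="Hello, this is a speech synthesizer.",
    language_code="en-US",
    sample_rate_hz=44100,
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
)

# Write the raw PCM audio returned by the service into a WAV container.
with wave.open("output.wav", "wb") as out:
    out.setnchannels(1)   # mono
    out.setsampwidth(2)   # 16-bit samples
    out.setframerate(44100)
    out.writeframes(resp.audio)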

For more details on getting started with this NIM, visit the NVIDIA NIM Docs.