Riva uses gRPC APIs. The instructions below demonstrate how to use the whisper-large-v3 model with the Python gRPC client.
You will need a system with Git and Python 3+ installed. Install the Riva Python client package:
pip install nvidia-riva-client
Download the Python client code by cloning the Python client repository:
git clone https://github.com/nvidia-riva/python-clients.git
Make sure you have a speech file containing mono, 16-bit audio in WAV, OPUS, or FLAC format. If you have generated the API key, it will be auto-populated in the command. Open a command terminal and run the command below to transcribe audio. Setting --language-code to multi enables automatic language detection. If you know the source language, specifying it is recommended for better accuracy and latency. See Supported Languages for the list of all available languages and their corresponding codes.
python python-clients/scripts/asr/transcribe_file_offline.py \
    --server grpc.nvcf.nvidia.com:443 --use-ssl \
    --metadata function-id "b702f636-f60c-4a3d-a6f4-f3568c13bd7d" \
    --metadata "authorization" "Bearer $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC" \
    --language-code en \
    --input-file <path_to_audio_file>
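The input format requirement stated above can be checked locally before a file is sent. The sketch below uses Python's standard wave module to confirm that a WAV file is mono with 16-bit samples. The function name is_mono_16bit_wav is hypothetical and not part of the Riva client; OPUS and FLAC files are not covered, since the standard library cannot read them.

```python
import wave


def is_mono_16bit_wav(path):
    """Return True if `path` is a WAV file with one channel and 16-bit samples.

    Hypothetical helper for illustration; it only understands PCM WAV files.
    """
    try:
        with wave.open(path, "rb") as wav:
            channels = wav.getnchannels()      # 1 means mono
            sample_width = wav.getsampwidth()  # bytes per sample; 2 means 16-bit
    except (wave.Error, EOFError):
        # Not a WAV file that the standard library can parse.
        return False
    return channels == 1 and sample_width == 2
```

A file that fails this check could be converted to mono 16-bit audio with an audio tool of your choice before running the command.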
The following command demonstrates translation from French (fr) to English.
python python-clients/scripts/asr/transcribe_file_offline.py \
    --server grpc.nvcf.nvidia.com:443 --use-ssl \
    --metadata function-id "b702f636-f60c-4a3d-a6f4-f3568c13bd7d" \
    --metadata "authorization" "Bearer $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC" \
    --language-code fr \
    --custom-configuration "task:translate" \
    --input-file <path_to_audio_file>
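The same requests can also be issued from Python code instead of through the script. The following is a minimal sketch, assuming the nvidia-riva-client package exposes Auth, ASRService, RecognitionConfig, and add_custom_configuration_to_config in the way transcribe_file_offline.py uses them. The helper names build_metadata and recognize_file are hypothetical, and the exact behavior may differ between client versions.

```python
FUNCTION_ID = "b702f636-f60c-4a3d-a6f4-f3568c13bd7d"  # taken from the commands above
SERVER = "grpc.nvcf.nvidia.com:443"


def build_metadata(api_key, function_id=FUNCTION_ID):
    """Return metadata pairs equivalent to the two --metadata flags."""
    return [
        ["function-id", function_id],
        ["authorization", f"Bearer {api_key}"],
    ]


def recognize_file(audio_path, api_key, language_code="en", custom_configuration=""):
    """Send one audio file for offline recognition and return the response.

    Hypothetical helper. Passing language_code="fr" and
    custom_configuration="task:translate" mirrors the translation command.
    """
    # Imported lazily so build_metadata stays usable without the package.
    import riva.client

    auth = riva.client.Auth(
        uri=SERVER, use_ssl=True, metadata_args=build_metadata(api_key)
    )
    asr_service = riva.client.ASRService(auth)
    config = riva.client.RecognitionConfig(
        language_code=language_code, max_alternatives=1
    )
    # Assumed to copy "key:value" pairs into the config, as the script does.
    riva.client.add_custom_configuration_to_config(config, custom_configuration)
    with open(audio_path, "rb") as fh:
        data = fh.read()
    return asr_service.offline_recognize(data, config)
```

Under these assumptions, the text of each result would typically be read from response.results[i].alternatives[0].transcript on the returned response.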
Proto files can be downloaded from Riva gRPC Proto files and compiled to the target language using the protoc compiler. Example Riva clients in C++ and Python are provided below.