Multi-lingual model supporting speech-to-text recognition and translation.
Riva uses gRPC APIs. The instructions below demonstrate how to use the canary-1b-asr model with the Python gRPC client.
You will need a system with Git and Python 3+ installed. Install the Riva Python client package:
pip install nvidia-riva-client
Download the Python client code by cloning the Python client repository.
git clone https://github.com/nvidia-riva/python-clients.git
Make sure you have a mono, 16-bit speech file in WAV, OPUS, or FLAC format. If you have generated the API key, it will be auto-populated in the command. Open a command terminal and execute the command below to transcribe audio. If you know the source language, it is recommended to pass source_language in the custom configuration parameter.
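The mono, 16-bit requirement can be checked before sending a file. The following is a minimal sketch using only Python's standard wave module; it covers WAV input only (OPUS and FLAC would need other tooling), and the helper name check_wav is hypothetical, not part of the Riva client.

```python
import wave


def check_wav(path):
    """Return a list of problems; an empty list means mono, 16-bit WAV."""
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        if channels != 1:
            problems.append(f"expected 1 channel (mono), found {channels}")
        if sample_width != 2:
            problems.append(
                f"expected 16-bit samples, found {8 * sample_width}-bit"
            )
    return problems
```

Calling check_wav on a path to a WAV file returns an empty list when the file meets both requirements, and one message per failed requirement otherwise.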
The command below transcribes an English audio file.
python python-clients/scripts/asr/transcribe_file_offline.py \
    --server grpc.nvcf.nvidia.com:443 --use-ssl \
    --metadata function-id "ee8dc628-76de-4acc-8595-1836e7e857bd" \
    --metadata "authorization" "Bearer $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC" \
    --language-code en-US \
    --input-file <path_to_audio_file>
The command below translates English audio to Hindi.
python python-clients/scripts/asr/transcribe_file_offline.py \
    --server grpc.nvcf.nvidia.com:443 --use-ssl \
    --metadata function-id "ee8dc628-76de-4acc-8595-1836e7e857bd" \
    --metadata "authorization" "Bearer $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC" \
    --language-code en-US \
    --custom-configuration "target_language:hi-IN,task:translate" \
    --input-file <path_to_audio_file>
To transcribe or translate other supported languages, change the source language via --language-code and the target language via the target_language parameter.
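When the same call is scripted for several files or language pairs, the argument list can be assembled in Python. The following is a rough sketch, assuming the repository was cloned into the current directory as shown earlier; the helper name build_command is hypothetical, and every flag and value is taken from the two commands above.

```python
SCRIPT = "python-clients/scripts/asr/transcribe_file_offline.py"
SERVER = "grpc.nvcf.nvidia.com:443"
FUNCTION_ID = "ee8dc628-76de-4acc-8595-1836e7e857bd"


def build_command(input_file, api_key, language_code="en-US",
                  target_language=None):
    """Build the argument list for transcription or translation.

    With target_language=None the command transcribes; otherwise it
    adds the custom configuration that requests translation.
    """
    cmd = [
        "python", SCRIPT,
        "--server", SERVER, "--use-ssl",
        "--metadata", "function-id", FUNCTION_ID,
        "--metadata", "authorization", f"Bearer {api_key}",
        "--language-code", language_code,
    ]
    if target_language is not None:
        cmd += [
            "--custom-configuration",
            f"target_language:{target_language},task:translate",
        ]
    cmd += ["--input-file", input_file]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True). Passing a list rather than a single shell string avoids quoting problems with the Bearer token and with file paths that contain spaces.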
Riva uses gRPC APIs. Proto files can be downloaded from the Riva gRPC proto files repository and compiled to a target language using the protoc compiler. Example Riva clients in C++ and Python are provided below.