
Accurate and optimized English transcriptions with punctuation and word timestamps
Riva uses gRPC APIs. The instructions below demonstrate how to use the parakeet-tdt-0_6b-v2 model with the Python gRPC client.
You will need a system with Git and Python 3+ installed.
pip install -U nvidia-riva-client
Download the Python client code by cloning the Python client repository.
git clone https://github.com/nvidia-riva/python-clients.git
Open a terminal and run the command below to transcribe audio. Make sure you have a speech file in 16-bit mono format in a WAV, OGG, or OPUS container. The command reads the API key from the NVIDIA_API_KEY environment variable; if you have generated an API key, it will be auto-populated in the command.
The following command transcribes an English audio file.
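Before sending audio, the 16-bit mono requirement can be checked locally. The following is a minimal sketch for WAV files only, using Python's standard `wave` module; the helper name `check_wav_format` is hypothetical and is not part of the Riva client. OGG and OPUS containers are not covered by this check.

```python
import wave


def check_wav_format(path):
    """Return (ok, details); ok is True when the WAV file is 16-bit mono PCM.

    Hypothetical helper for illustration; not part of nvidia-riva-client.
    """
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        sample_rate = wav.getframerate()
    details = {
        "channels": channels,
        "bits_per_sample": sample_width * 8,
        "sample_rate_hz": sample_rate,
    }
    # Mono means one channel; 16-bit means two bytes per sample.
    ok = channels == 1 and sample_width == 2
    return ok, details
```

Calling `check_wav_format("sample.wav")` returns a boolean plus the detected channel count, bit depth, and sample rate, which helps explain why a file was rejected.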
python python-clients/scripts/asr/transcribe_file_offline.py \
--server grpc.nvcf.nvidia.com:443 --use-ssl \
--metadata function-id "d3fe9151-442b-4204-a70d-5fcc597fd610" \
--metadata "authorization" "Bearer $NVIDIA_API_KEY" \
--language-code en-US \
--word-time-offsets --automatic-punctuation \
--input-file <path_to_audio_file>
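The same request can also be issued directly from Python with the `riva.client` package installed above. The sketch below mirrors the command-line flags (server, SSL, metadata, language code, word time offsets, automatic punctuation). It assumes the `Auth`, `ASRService`, `RecognitionConfig`, and `offline_recognize` names exposed by recent versions of nvidia-riva-client; the helpers `build_metadata` and `transcribe` are hypothetical names introduced here for illustration.

```python
import os
import sys

# Function ID and server address copied from the command above.
FUNCTION_ID = "d3fe9151-442b-4204-a70d-5fcc597fd610"
SERVER = "grpc.nvcf.nvidia.com:443"


def build_metadata(api_key, function_id=FUNCTION_ID):
    """Build the key/value pairs passed as --metadata on the command line."""
    return [
        ["function-id", function_id],
        ["authorization", f"Bearer {api_key}"],
    ]


def transcribe(path, api_key):
    """Send one audio file for offline recognition (assumed client API)."""
    # Imported here so build_metadata works without the package installed.
    import riva.client

    auth = riva.client.Auth(
        use_ssl=True, uri=SERVER, metadata_args=build_metadata(api_key)
    )
    asr_service = riva.client.ASRService(auth)
    config = riva.client.RecognitionConfig(
        language_code="en-US",
        max_alternatives=1,
        enable_automatic_punctuation=True,
        enable_word_time_offsets=True,
    )
    with open(path, "rb") as fh:
        data = fh.read()
    return asr_service.offline_recognize(data, config)


if __name__ == "__main__":
    response = transcribe(sys.argv[1], os.environ["NVIDIA_API_KEY"])
    for result in response.results:
        best = result.alternatives[0]
        print(best.transcript)
        for word in best.words:
            # Each word carries its start and end offsets.
            print(word.word, word.start_time, word.end_time)
```

Run it with the audio file path as the only argument and NVIDIA_API_KEY set in the environment. Keeping the metadata construction in its own function makes it easy to swap in a different function ID without touching the request logic.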
Proto files can be downloaded from the Riva gRPC proto files repository and compiled to a target language using the protoc compiler. Example Riva clients in C++ and Python are provided below.