nvidia/audio2face
PREVIEW
Converts streamed audio to facial blendshapes for real-time lip syncing and facial performances.
Getting Started
Audio2Face uses gRPC APIs. The following instructions demonstrate how to use a model with the Python client. The currently available models are Mark and Claire.
Prerequisites
You will need a system with Python 3+ installed.
Prepare Python Client
Start by creating a Python virtual environment:
$ python3 -m venv .venv
$ source .venv/bin/activate
Download and Install Proto Files
The Python wheel is available for download here:
$ pip3 install nvidia_ace-1.0.0-py3-none-any.whl
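To verify the installation, you can try importing the package; the top-level module name nvidia_ace is an assumption based on the wheel's filename.
$ python3 -c "import nvidia_ace"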
Download A2F Python Client
Download the Python client code by cloning the ACE GitHub repository.
$ git clone https://github.com/NVIDIA/ACE.git
$ cd ACE/microservices/audio_2_face_microservice/scripts/audio2face_api_client
Then install the required dependencies:
$ pip3 install -r requirements.txt
Run Python Client
To run with Claire model:
$ python ./nim_a2f_client.py ./audio/sample.wav ./config/config_claire.yml \
    --apikey $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC \
    --function-id 462f7853-60e8-474a-9728-7b598e58472c
To run with Mark model:
$ python ./nim_a2f_client.py ./audio/sample.wav ./config/config_mark.yml \
    --apikey $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC \
    --function-id 945ed566-a023-4677-9a49-61ede107fd5a
The script takes four mandatory parameters: an audio file in 16-bit PCM format, a YAML configuration file for the emotion parameters, the API Key generated through the API Catalog, and the Function ID used to access the API function.
- --apikey: the API Key generated through the API Catalog
- --function-id: the Function ID provided to access the API function for the model of interest
What does this example do?
- Reads the audio data from a 16-bit PCM WAV file
- Reads emotions and parameters from the YAML configuration file
- Sends the emotions, parameters, and audio to the A2F Controller
- Receives back blendshapes, audio, and emotions
- Saves the blendshapes as animation key frames in a CSV file with their names, values, and time codes
- Does the same for the received emotion data
- Saves the received audio as out.wav (it should match the input audio)
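As an illustration of the CSV step, here is a minimal, self-contained sketch of writing blendshape key frames to a CSV file. The frames variable and its layout (time code plus name/value pairs) are illustrative assumptions, not the actual gRPC message types returned by the A2F Controller.

# Minimal sketch: save blendshape key frames as CSV rows (name, value, time code).
# The `frames` list below is an illustrative stand-in for the data the client
# receives from the A2F Controller, not the real nvidia_ace message types.
import csv

frames = [
    (0.000, {"JawOpen": 0.12, "MouthSmileLeft": 0.05}),
    (0.033, {"JawOpen": 0.31, "MouthSmileLeft": 0.07}),
]

with open("blendshapes_out.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timeCode", "blendShapeName", "value"])
    for time_code, shapes in frames:
        for name, value in shapes.items():
            writer.writerow([time_code, name, value])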
Connect from any client
For a gRPC connection from any client, use the following endpoint and function ID alongside the API Key. To generate a new API Key, click the Get API Key button on this page.
grpc.nvcf.nvidia.com:443 or https://grpc.nvcf.nvidia.com:443
authorization: Bearer $API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC
function-id: <function ID>
Function IDs
Mark model: 945ed566-a023-4677-9a49-61ede107fd5a
Claire model: 462f7853-60e8-474a-9728-7b598e58472c
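For a custom Python client, the channel setup and metadata could look roughly like the sketch below. The channel and metadata handling use the standard grpcio API; the service stub and streaming call are placeholders for the classes generated from the nvidia_ace proto files, so treat those names as assumptions rather than the actual interface.

# Sketch: open an authenticated gRPC channel to the Audio2Face endpoint.
# The metadata keys mirror the headers listed above; the stub and RPC name
# are placeholders for the generated nvidia_ace classes.
import os
import grpc

API_KEY = os.environ["API_KEY_REQUIRED_IF_EXECUTING_OUTSIDE_NGC"]
FUNCTION_ID = "462f7853-60e8-474a-9728-7b598e58472c"  # Claire model

metadata = (
    ("authorization", f"Bearer {API_KEY}"),
    ("function-id", FUNCTION_ID),
)

channel = grpc.secure_channel(
    "grpc.nvcf.nvidia.com:443", grpc.ssl_channel_credentials()
)

# stub = SomeA2FServiceStub(channel)                  # generated from the proto files
# responses = stub.SomeStreamingCall(requests, metadata=metadata)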