Converts streamed audio to facial blendshapes for real-time lip-syncing and facial performances.
Create real-time digital twins by combining accelerated solvers, simulation AI, and virtual environments.
Enhance speech by correcting common audio degradations to create studio-quality speech output.
Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.
Create intelligent, interactive avatars for customer service across industries.
Estimate the gaze angles of a person in a video and redirect the gaze to be frontal.
Create facial animations using a portrait photo and synchronize mouth movement with audio.
VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.