Request the model checkpoint from Stability AI
Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences. This model was trained to generate 25 frames at 576x1024 resolution given a context frame of the same size, fine-tuned from the 14-frame SVD Image-to-Video model.
Developed by: Stability AI
Funded by: Stability AI
Model type: Generative image-to-video model
By using this software or model, you are agreeing to the terms and conditions of the license, acceptable use policy and Stability’s privacy policy.
Stable Video Diffusion Model Card
Architecture Type: Convolutional Neural Network (CNN)
Network Architecture: UNet + attention blocks
Model Version: SVD XT
Input Format: Red, Green, Blue (RGB) Image
Input Parameters: motion_bucket_id, frames_per_second, guidance_scale, seed
Output Format: Video
Output Parameters: seed
Supported Hardware Platform(s): Hopper, Ampere/Turing
Supported Operating System(s): Linux
Engine: Triton
Test Hardware: Other
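The input parameters above map directly onto an image-to-video generation call. The sketch below shows one plausible way to drive SVD XT through the Hugging Face `diffusers` library's `StableVideoDiffusionPipeline`; the checkpoint id, file names, and parameter values are illustrative assumptions, not part of this card, and the heavy pipeline call requires a CUDA GPU with the model weights downloaded.

```python
# Conditioning parameters from this card, with illustrative default values
# (not prescribed by the card itself).
GEN_PARAMS = {
    "motion_bucket_id": 127,  # higher values condition on more motion
    "fps": 7,                 # frames_per_second conditioning signal
    "num_frames": 25,         # SVD XT generates 25 frames per clip
    "seed": 42,               # seed for reproducible sampling
}


def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Playback length of the generated clip at the conditioned frame rate."""
    return num_frames / fps


if __name__ == "__main__":
    # Requires `pip install diffusers transformers accelerate` and a GPU.
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image

    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid-xt",  # public HF repo id
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")

    # RGB conditioning frame, resized to the 576x1024 training resolution.
    image = load_image("conditioning_frame.png").resize((1024, 576))

    generator = torch.manual_seed(GEN_PARAMS["seed"])
    frames = pipe(
        image,
        motion_bucket_id=GEN_PARAMS["motion_bucket_id"],
        fps=GEN_PARAMS["fps"],
        decode_chunk_size=8,  # decode fewer latent frames at once to cap VRAM
        generator=generator,
    ).frames[0]

    export_to_video(frames, "generated.mp4", fps=GEN_PARAMS["fps"])
```

At 25 frames and 7 fps the output is a clip of roughly 3.6 seconds; raising `fps` shortens playback without changing the frame count.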