shutterstock/edify-360-hdri

Shutterstock Generative 3D service for 360 HDRi generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.

Model Overview

Description:

Shutterstock Generative 3D for 360 HDRi, powered by NVIDIA Edify, generates high-resolution (up to 16K), high-dynamic-range, 32-bit, 360-degree panoramas from a text prompt or an optional 1K base image.

References:

This model builds on large-scale diffusion foundation models and LDR-to-HDR (ldr2hdr) models.

[1] Balaji, Y., Nah, S., Huang, X., Vahdat, A., Song, J., Kreis, K., Aittala, M., Aila, T., Laine, S., Catanzaro, B. and Karras, T., 2022. eDiff-I: Text-to-image diffusion models with an ensemble of expert denoisers. arXiv preprint arXiv:2211.01324.

[2] Zhang, L., Rao, A. and Agrawala, M., 2023. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3836-3847).

[3] Chen, Z., Wang, G. and Liu, Z., 2022. Text2Light: Zero-shot text-driven HDR panorama generation. ACM Transactions on Graphics (TOG), 41(6), pp.1-16.

Model Architecture:

Architecture Type: Convolutional Neural Network (CNN) and Transformer
Network Architecture: U-Net-based CNN and Transformer
This model is based on a diffusion architecture and a Transformer architecture.

Input:

Input Type(s): Text (Prompt), Image (Optional)
Input Format(s): Text: Raw and Image: Red, Green, Blue (RGB)
Input Parameters: Text: One-Dimensional (1D) and Image: Two-Dimensional (2D, optional)
Other Properties Related to Input: Maximum of 77 text tokens. If an image is provided, its minimum resolution is 1024 x 1024.
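
The two documented input limits can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the service's API: the function name `check_request` is hypothetical, and the token count is assumed to be supplied by the caller, since this card does not name the tokenizer the service uses.

```python
from typing import List, Optional, Tuple

MAX_PROMPT_TOKENS = 77   # limit stated in this model card
MIN_IMAGE_SIDE = 1024    # minimum width and height stated in this model card


def check_request(token_count: int,
                  image_size: Optional[Tuple[int, int]] = None) -> List[str]:
    """Return a list of problems; an empty list means both documented limits are met.

    token_count: number of prompt tokens, counted by the caller (the tokenizer
                 is an assumption left to the caller).
    image_size:  (width, height) of the optional base image, or None if absent.
    """
    problems = []
    if token_count > MAX_PROMPT_TOKENS:
        problems.append(
            f"prompt has {token_count} tokens; maximum is {MAX_PROMPT_TOKENS}"
        )
    if image_size is not None:
        width, height = image_size
        if width < MIN_IMAGE_SIDE or height < MIN_IMAGE_SIDE:
            problems.append(
                f"image is {width}x{height}; minimum is "
                f"{MIN_IMAGE_SIDE}x{MIN_IMAGE_SIDE}"
            )
    return problems


if __name__ == "__main__":
    # A 512-pixel-wide image fails the minimum-resolution check.
    print(check_request(40, (512, 2048)))
```

The sketch returns every violation at once, rather than raising on the first, so a caller can report all problems in a single message.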

Output:

Output Type(s): Image
Output Format: .hdr or optionally .exr format; 4K, 8K, or 16K (4096x2048, 8192x4096, or 16384x8192) resolution.
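
To give a sense of scale for these output sizes, the sketch below estimates the memory needed to hold one decoded panorama. It assumes three channels (RGB) stored as 32-bit floats once loaded, which is an assumption about the in-memory representation and not a statement about the on-disk size of the .hdr or .exr files. The names `OUTPUT_RESOLUTIONS` and `decoded_size_bytes` are illustrative only.

```python
# Output resolutions listed in this model card (width, height).
OUTPUT_RESOLUTIONS = {
    "4K": (4096, 2048),
    "8K": (8192, 4096),
    "16K": (16384, 8192),
}


def decoded_size_bytes(label: str, channels: int = 3,
                       bytes_per_channel: int = 4) -> int:
    """Estimate the in-memory size of one decoded panorama.

    Assumes `channels` float channels of `bytes_per_channel` bytes each
    (default: RGB float32). This is an assumption, not a documented property.
    """
    width, height = OUTPUT_RESOLUTIONS[label]
    return width * height * channels * bytes_per_channel


if __name__ == "__main__":
    for label, (width, height) in OUTPUT_RESOLUTIONS.items():
        mib = decoded_size_bytes(label) / 2**20
        print(f"{label}: {width}x{height}, about {mib:.0f} MiB as RGB float32")
    # 4K: 4096x2048, about 96 MiB as RGB float32
    # 8K: 8192x4096, about 384 MiB as RGB float32
    # 16K: 16384x8192, about 1536 MiB as RGB float32
```

Each step up doubles both dimensions, so the decoded size grows by a factor of four.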

Software Integration:

Supported Hardware Microarchitecture Compatibility:

  • NVIDIA Ampere

[Preferred/Supported] Operating System(s):

  • Linux

Model Version(s):

Edify 360 v0.2.1

Training & Evaluation:

Training Dataset:

Link: Shutterstock Images, DoschDesign, HDRmaps, HDRISkies, CGIBackgrounds, HDRI Haven

Data Collection Method by dataset:

  • Customer data

Labeling Method by dataset:

  • Automated

Properties (Quantity, Dataset Descriptions, Sensor(s)): 550 million image-text pairs of licensed, high-quality photography and illustrations, plus about 17k HDR images of 360-degree natural scenes.

Evaluation Dataset:

Data Collection Method by dataset:

  • Customer data

Properties (Quantity, Dataset Descriptions, Sensor(s)): Data contains 550 million image-text pairs of licensed, high-quality photography and illustrations, plus about 17k HDR images of 360-degree natural scenes.

Inference:

Engine: TensorRT, Triton
Test Hardware:

  • NVIDIA H100

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.

Contact:

https://www.shutterstock.com/help