This blueprint extends the existing NVIDIA AI Blueprint - AI Virtual Assistant for Customer Service - to demonstrate how traceability can be added to an AI Agent.
Weights & Biases Weave framework helps customers evaluate, monitor, and iterate on their AI applications to accelerate the development and deployment process. Weave enables continuous improvement in quality, latency, cost, and safety by running comprehensive evaluations, keeping pace with new models, debugging, and monitoring production performance—all while ensuring secure collaboration. Any enterprise wanting to take a generative AI application from pilot to production needs to have a way to monitor, evaluate and iterate for gaining insights on how their application is performing and to further power the data flywheel.
Developers can use this reference blueprint to extend W&B Weave capabilities to the AI Virtual Assistant for Customer Service blueprint or apply it to another NVIDIA AI Blueprint.
Architecture Diagram
Key Features
This blueprint extension achieves the following:
- Showcases how Weights & Biases integrates into the workflow to provide seamless tracing, evaluations, and iteration tooling, ensuring a more efficient and accelerated iteration and promotion process
- Demonstrates how to bring an AI application closer to production readiness by adding Weave from Weights and Biases
Minimum System Requirements
The solution leverages NVIDIA's cloud-based API Catalog endpoints, eliminating the need for local GPU hardware. All model inference is performed on NVIDIA's cloud infrastructure.
Software used in this blueprint
NIM microservices
- NVIDIA NeMo Retriever embedding NIM
- NVIDIA NeMo Retriever Mistral 4B reranking NIM
- Llama 3.1 70B instruct NIM
- Nemotron-4 340B NIM
Please refer to AI Virtual Assistant for Customer Service for the details on the foundational blueprint.
3rd-Party Technologies
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.
License
Use of the models in this blueprint is governed by the NVIDIA AI Foundation Models Community License.
Terms of Use
GOVERNING TERMS: The blueprint is governed by the NVIDIA Agreements | Enterprise Software | NVIDIA Software License Agreement and NVIDIA Agreements | Enterprise Software | Product Specific Terms for AI Product.
Meta Llama 3.1 70B Instruct
GOVERNING TERMS: The NIM container is governed by the NVIDIA Software License Agreement and the Product Specific Terms for AI Products;
Use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement. ADDITIONAL INFORMATION: Llama 3.1 Community License Agreement, Built with Llama.
NVIDIA Retrieval QA E5 Embedding Model
GOVERNING TERMS: The NIM container is governed by NVIDIA Agreements | Enterprise Software | NVIDIA Software License Agreement and NVIDIA Agreements | Enterprise Software | Product Specific Terms for AI Product; and the use of this model is governed by the ai-foundation-models-community-license.pdf (nvidia.com). ADDITIONAL INFORMATION: MIT license.
NeMo Retriever QA Mistral 4B Reranking v3
GOVERNING TERMS: The NIM container is governed by NVIDIA Agreements | Enterprise Software | NVIDIA Software License Agreement and NVIDIA Agreements | Enterprise Software | Product Specific Terms for AI Product; and the use of this model is governed by the ai-foundation-models-community-license.pdf (nvidia.com). ADDITIONAL INFORMATION: Apache license.