With AI shopping assistants, retailers can deliver more engaging customer interactions, around the clock and across the world.

A retail shopping assistant needs personalization and the ability to answer complex, long-tail questions. However, current tools typically perform well only for short, keyword-oriented queries. As a result, customers often cannot find everything they are looking for, or they need help working out which products they need. Whether the task is gathering all the necessary items for a backyard or planning a soccer-themed birthday party for a child, the search can take multiple attempts and sometimes still fails. This not only frustrates the consumer but also represents a lost opportunity for the retailer to capture revenue and drive up-sell and cross-sell.

This NVIDIA AI Blueprint provides a reference example to enhance customer experiences, drive higher conversion rates, lower product return rates and increase the average size of orders through highly intelligent, personalized suggestions of complementary products or upgrades. It shows developers how NVIDIA NIM™ microservices can be used to develop solutions that enable more natural, personalized online shopping experiences. It features NVIDIA AI Enterprise software, including NVIDIA NIM™ microservices for Meta Llama 3.1 70B, NVIDIA Retrieval QA E5 Embedding v5 to deliver AI performance at scale, and NVIDIA NeMo Guardrails safety features.

Architecture Diagram

Key Features

  • An end-to-end sample multimodal, multi-query agentic RAG pipeline that includes image-to-image similarity search with NVClip, enabling consumers to use text and images in queries
  • Optimized LLM inference performance and scaling through NIM, including the Llama 3.1 70B NIM microservice bringing reasoning capabilities to AI shopping assistants for natural, humanlike interactions
  • Guardrails that help ensure customer conversations with the shopping assistant remain safe and on-topic, protecting brand values
  • World-class information retrieval with high accuracy and data privacy, powered by NVIDIA Retrieval QA E5 Embedding v5
  • Integration with LangChain and the NVIDIA cuVS GPU-accelerated Milvus vector database (illustrated in the below workflow)
  • Sample retail product catalog and imagery with the ability to ingest retailers’ product catalog text and image data for accurate, context-aware responses
  • The flexibility to use other models from the NVIDIA API catalog or self-hosted models
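
To make the multi-query retrieval idea in the feature list more concrete, here is a minimal sketch in plain Python. It is not the blueprint's implementation: the catalog, the three-dimensional vectors, and the function names (`cosine`, `search`, `multi_query_search`) are all hypothetical, and a simple in-memory list stands in for the Milvus vector database. In a real deployment the vectors would come from an embedding model rather than being written by hand.

```python
import math

# Hypothetical toy catalog: product name -> hand-made 3-d embedding.
# Real embeddings come from an embedding model and have far more dimensions.
CATALOG = {
    "soccer ball": [0.9, 0.1, 0.0],
    "soccer-themed party plates": [0.7, 0.6, 0.1],
    "birthday candles": [0.1, 0.9, 0.1],
    "patio chair": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, top_k=2):
    """Return the top_k catalog items most similar to query_vec."""
    scored = [(cosine(query_vec, vec), name) for name, vec in CATALOG.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


def multi_query_search(query_vecs, top_k=2):
    """Run one search per sub-query and merge results in first-seen order."""
    merged = []
    for vec in query_vecs:
        for name in search(vec, top_k):
            if name not in merged:
                merged.append(name)
    return merged


# Assumed decomposition of "soccer themed birthday party" into two
# sub-queries: one about soccer, one about birthdays.
sub_queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
results = multi_query_search(sub_queries)
print(results)
```

Splitting a broad request into sub-queries lets each search focus on a single intent, so the merged list covers both soccer gear and party supplies while unrelated items such as the patio chair are left out.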

Minimum System Requirements

Hardware Requirements

  • As your RAG development progresses, you will likely want to self-host the NIM microservices. To self-host the blueprint with these microservices deployed locally, the recommended system is 4 H100 GPUs running the Llama 3.1 70B NIM, the NVIDIA Retrieval QA E5 Embedding and NVCLIP NIMs, and the Milvus database accelerated with NVIDIA cuVS.

Deployment Options

  • Docker

Software Used in This Blueprint

NVIDIA Technology

3rd Party Software

  • LangChain
  • Milvus database (accelerated with NVIDIA cuVS)
  • SQLite

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.

License

GOVERNING TERMS: Use of the blueprint software and materials and NIM containers is governed by the NVIDIA Software License Agreement and Product-specific Terms for AI products; and the use of models is governed by the NVIDIA Community Model License.

ADDITIONAL INFORMATION: (i) Llama 3.1 Community License Agreement for Llama 3.1 70B Instruct NIM, Llama 3.1 NemoGuard 8B - Content Safety and Llama 3.1 NemoGuard 8B - Topic Control models, built with Llama, (ii) MIT license for NV-EmbedQA-E5-v5.

Use of the product catalog data in the retail shopping assistant is governed by the terms of the NVIDIA Data License for Retail Shopping Assistant (15Aug2025).