Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
An embedding model for text retrieval tasks that excels at dense, multi-vector, and sparse retrieval.
A GPU-accelerated model optimized to produce a probability score indicating how likely a given passage is to contain the information needed to answer a question.
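The pattern this kind of reranking model enables can be sketched as follows. The passage scores here are hypothetical stand-ins for the probability scores such a model would return; the `rerank` helper and the example query are illustrative, not part of any NVIDIA API.

```python
# Rerank candidate passages by a relevance probability score, as a
# reranking model like the one described would produce per passage.
# The scores below are hypothetical, not real model output.

def rerank(query: str, scored_passages: list[tuple[str, float]]) -> list[str]:
    """Return passages ordered by descending relevance probability."""
    return [p for p, _ in sorted(scored_passages, key=lambda sp: sp[1], reverse=True)]

query = "When was the GeForce 256 released?"
candidates = [
    ("GPUs accelerate parallel workloads.", 0.32),          # hypothetical score
    ("NVIDIA introduced the GeForce 256 in 1999.", 0.91),   # hypothetical score
    ("CPUs execute instructions serially.", 0.07),          # hypothetical score
]

top = rerank(query, candidates)
print(top[0])  # the highest-scoring passage comes first
```

In a retrieval pipeline, the embedding model would fetch candidate passages first, and the reranker's scores would then replace the hypothetical numbers above to pick the final context.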