RAG Pipeline: Embedding Search + LLM Inference

Deploy an embedding model in OpenSearch for vector similarity retrieval, then deploy a large language model via PAI for online inference, combining both into a retrieval-augmented generation (RAG) pipeline.

Products involved

OpenSearch and Platform for AI (PAI).

Scenario

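A user query is embedded by the embedding model deployed in OpenSearch and matched against indexed documents by vector similarity. The retrieved passages are then passed as context to the large language model served through PAI online inference, which generates the answer. Together, the two deployments form a retrieval-augmented generation (RAG) pipeline.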
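To make the "vector similarity retrieval" step concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the metric, which is one common choice and may differ from what a given OpenSearch index is configured to use. The function names and the toy three-dimensional vectors are hypothetical and exist only for illustration; a real deployment would obtain vectors from the embedding model and run the search inside OpenSearch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy embeddings, made up for illustration only.
DOC_VECS = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 0.0, 1.0],
}
```

Calling top_k([1.0, 0.0, 0.0], DOC_VECS, k=2) ranks doc-a first and doc-b second, because doc-c is orthogonal to the query.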

How the products combine

  1. pai · pai-deploy-inference: Platform for AI (PAI), deploy a model for online inference.
     See pai/pai-deploy-inference.

  2. opensearch · opensearch-deploy-model: OpenSearch, deploy an embedding model for inference.
     See opensearch/opensearch-deploy-model.
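The way the two deployments fit together can be sketched as a small orchestration function. This is an illustration under stated assumptions, not the products' actual client code: `retrieve` and `generate` are hypothetical callables standing in for a call to the OpenSearch retrieval step and a call to the PAI inference endpoint, and the prompt layout is just one plausible format.

```python
from typing import Callable, List


def build_prompt(question: str, passages: List[str]) -> str:
    """Put the retrieved passages ahead of the question as numbered context."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def rag_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],  # hypothetical: wraps the OpenSearch query
    generate: Callable[[str], str],             # hypothetical: wraps the PAI inference call
    k: int = 3,
) -> str:
    """Retrieve k passages for the question, then ask the LLM to answer with them."""
    passages = retrieve(question, k)
    prompt = build_prompt(question, passages)
    return generate(prompt)


# Stand-ins so the sketch runs without any deployed service.
def stub_retrieve(question: str, k: int) -> List[str]:
    corpus = ["Passage one.", "Passage two.", "Passage three."]
    return corpus[:k]


def stub_generate(prompt: str) -> str:
    return f"stub answer (prompt length: {len(prompt)} characters)"


if __name__ == "__main__":
    print(rag_answer("What is RAG?", stub_retrieve, stub_generate, k=2))
```

Passing the retrieval and generation steps in as callables keeps the orchestration independent of either service, so each stub can later be swapped for a real client without changing `rag_answer`.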

Typical questions