RAG Pipeline: Embedding Search + LLM Inference

Deploy an embedding model in OpenSearch for vector similarity retrieval, then deploy a large language model via PAI for online inference, combining both into a retrieval-augmented generation (RAG) pipeline.

Products involved

Scenario

How the products combine

pai · pai-deploy-inference — Platform for AI (PAI) — Deploy a model for online inference

See pai/pai-deploy-inference.

opensearch · opensearch-deploy-model — OpenSearch — Deploy embedding model for inference

See opensearch/opensearch-deploy-model.
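
The flow described above (embed the query, retrieve the most similar passages, pass them to the LLM as context) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the OpenSearch or PAI API: `embed` is a toy bag-of-words stand-in for the deployed embedding model, `retrieve` is an in-memory stand-in for the vector index, and `echo_llm` is a placeholder for the deployed inference endpoint. All function names here are hypothetical.

```python
import math
from collections import Counter
from typing import Callable


def embed(text: str) -> Counter:
    """Toy stand-in for the embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """In-memory stand-in for the vector index: top-k docs by similarity."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def answer(query: str, docs: list[str], llm: Callable[[str], str], k: int = 2) -> str:
    """Retrieval step followed by the generation step."""
    return llm(build_prompt(query, retrieve(query, docs, k)))


def echo_llm(prompt: str) -> str:
    """Placeholder for the LLM endpoint: returns the prompt unchanged."""
    return prompt


if __name__ == "__main__":
    docs = [
        "vector search finds similar passages",
        "billing is charged monthly",
        "embedding models map text to vectors",
    ]
    print(answer("how does vector search work", docs, echo_llm, k=1))
```

In a real deployment, `embed` and `retrieve` would be replaced by calls to the embedding model and vector index on the OpenSearch side, and `echo_llm` by a request to the inference service deployed through PAI; the two product pages referenced above cover those steps.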

Typical questions

build RAG system
deploy RAG pipeline
embedding search plus LLM
vector search and model inference
retrieval augmented generation deploy
deploy a RAG system
vector retrieval plus large model inference
build retrieval-augmented generation