RAG Pipeline: Embedding Search + LLM Inference
Deploy an embedding model in OpenSearch for vector similarity retrieval, then deploy a large language model via PAI for online inference, combining both into a retrieval-augmented generation (RAG) pipeline.
Products involved
Scenario
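An embedding model deployed in OpenSearch provides vector similarity retrieval, and a large language model deployed through PAI provides online inference. The two are combined into a retrieval-augmented generation (RAG) pipeline: passages retrieved by vector search are supplied to the LLM as context for its answer.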
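The retrieval half of this scenario can be illustrated with a minimal, self-contained sketch. It ranks pre-computed document vectors against a query vector by cosine similarity, which is one common similarity measure for embeddings. The three-dimensional vectors, the `DOCS` list, and the `top_k` helper are hypothetical illustrations, not OpenSearch or PAI APIs; in a real deployment the vectors would come from the deployed embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed_docs, k=2):
    """Return the k document texts most similar to query_vec, best first."""
    ranked = sorted(
        indexed_docs,
        key=lambda doc: cosine_similarity(query_vec, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy index of (text, embedding) pairs. The vectors are made up for
# illustration; a real embedding model emits far more dimensions.
DOCS = [
    ("PAI deploys models for online inference.", [0.9, 0.1, 0.0]),
    ("OpenSearch serves vector similarity retrieval.", [0.1, 0.9, 0.0]),
    ("Unrelated billing note.", [0.0, 0.0, 1.0]),
]

query_vec = [0.2, 0.8, 0.0]  # hypothetical embedding of the user question
passages = top_k(query_vec, DOCS, k=2)
```

Here `passages` holds the two texts closest to the query in direction, and the unrelated note is ranked last and dropped. A production vector index would typically use an approximate nearest-neighbor structure instead of sorting every document.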
How the products combine
- pai · pai-deploy-inference — Platform for AI (PAI) — Deploy a model for online inference
See pai/pai-deploy-inference.
- opensearch · opensearch-deploy-model — OpenSearch — Deploy embedding model for inference
See opensearch/opensearch-deploy-model.
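One way to picture how the two deployments fit together is as two injected callables: one standing in for the OpenSearch-side embedding and retrieval, the other for the PAI-hosted LLM. The sketch below is an illustration under assumptions. `RagPipeline`, `retrieve`, `generate`, and the fake services are hypothetical names, not part of either product's SDK, and a real integration would replace the fakes with calls to the deployed endpoints.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class RagPipeline:
    # Hypothetical stand-in for the OpenSearch side: (query, k) -> passages.
    retrieve: Callable[[str, int], List[str]]
    # Hypothetical stand-in for the PAI side: prompt -> completion text.
    generate: Callable[[str], str]
    top_k: int = 3

    def answer(self, question: str) -> str:
        """Retrieve context for the question, then ask the LLM."""
        passages = self.retrieve(question, self.top_k)
        if passages:
            context = "\n".join(passages)
            prompt = f"Context:\n{context}\n\nQuestion: {question}"
        else:
            prompt = question  # no context found; fall back to the bare question
        return self.generate(prompt)

# Fakes so the sketch runs without network access or deployed services.
def fake_retrieve(query: str, k: int) -> List[str]:
    corpus = ["Passage A about RAG.", "Passage B about embeddings."]
    return corpus[:k]

def fake_generate(prompt: str) -> str:
    return "ECHO: " + prompt  # a real LLM would return a generated answer

pipeline = RagPipeline(retrieve=fake_retrieve, generate=fake_generate, top_k=1)
result = pipeline.answer("What is RAG?")
```

Keeping both services behind plain callables means either side can be swapped or mocked independently, which is convenient when the embedding model and the LLM are deployed and scaled as separate services.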
Typical questions
- build RAG system
- deploy RAG pipeline
- embedding search plus LLM
- vector search and model inference
- retrieval augmented generation deploy
- deploy a RAG system
- vector retrieval plus large-model inference
- build retrieval-augmented generation