
Deploy Enterprise RAG Application Stack

Use Terraform to provision the underlying cloud infrastructure (VPC, ECS clusters, networking), deploy a RAG AI application on Elasticsearch for vector-based knowledge retrieval, and configure RDS database accounts and permissions for the application's relational backend storing user sessions, metadata, and conversation logs.

Products involved

Terraform
Elasticsearch
ApsaraDB RDS

Scenario

Developers use this stack when building production AI assistants that require scalable vector search for private document retrieval alongside a relational database for user state, audit trails, and conversation history. Terraform automates secure provisioning of isolated networking, compute, and managed data services, while Elasticsearch and RDS handle the AI retrieval and operational workloads respectively.

Integration steps

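Provision Network & Compute: Follow terraform-provision-infrastructure. Define alicloud_vpc, alicloud_vswitch, and alicloud_instance (the alicloud provider's resource type for ECS instances) in main.tf. Run terraform init && terraform apply -auto-approve; drop -auto-approve if you want to review the plan before it is applied.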
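Deploy Elasticsearch Cluster: Add an alicloud_elasticsearch_instance resource and size the data nodes with data_node_spec (for example data_node_spec = "elasticsearch.n4.small"; confirm that the spec is offered in your region). Terraform arguments are assigned with =, not :. To our knowledge the resource has no enable_vector_search argument; dense-vector search depends on the Elasticsearch version and on the index mapping created in a later step. Apply to create the cluster that serves as the vector engine.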
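Provision RDS Backend: Add alicloud_db_instance with engine = "PostgreSQL" and engine_version = "14.0", along with the instance_type and instance_storage that the resource requires. Set vswitch_id to the private vSwitch created earlier so that the instance is placed inside the VPC.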
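Configure RDS Accounts: Execute rds-manage-accounts CLI path: aliyun rds CreateAccount --DBInstanceId <rds-id> --AccountName rag_app --AccountPassword <pwd>. Grant access: aliyun rds GrantAccountPrivilege --DBInstanceId <rds-id> --AccountName rag_app --DBName rag_metadata --AccountPrivilege ReadWrite. The rag_metadata database must already exist on the instance. Check the accepted --AccountPrivilege values for your engine: the RDS API documents DBOwner as the value for PostgreSQL, while ReadWrite applies to MySQL-family engines.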
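Deploy RAG Container to ECS: Export the env vars on the ECS host: ES_ENDPOINT=https://<es-id>.elasticsearch.aliyuncs.com:9200 (use https only if HTTPS is enabled on the cluster, otherwise http), RDS_HOST=<rds-endpoint>, RDS_PORT=5432. Start the app via docker run -d -e ES_ENDPOINT -e RDS_HOST -e RDS_PORT <rag-image>. A bare -e NAME passes the value from the host shell, so each variable the app reads must be both exported and listed.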
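Initialize Vector Index & Ingest: Per es-deploy-application, create the index with its mapping: PUT /rag_docs {"mappings":{"properties":{"embedding":{"type":"dense_vector","dims":1536}}}}. (PUT /rag_docs/_mapping only updates an index that already exists and takes the properties object without the mappings wrapper.) Bulk ingest via POST /rag_docs/_bulk with chunked text and model-generated vectors.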
Validate Flow: Send a query to the app. Confirm that the k-NN retrieval reaches ES and that the app persists the exchange to RDS with INSERT INTO conversation_logs (session_id, query, response) VALUES (...).
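
The index-creation and bulk-ingest step can be sketched in Python as below. This is a minimal sketch, not the es-deploy-application implementation: the function names, the text field, and the (doc_id, text, vector) tuple shape are assumptions made for illustration. It only builds request bodies, so sending them to ES_ENDPOINT is left to whichever HTTP client the application uses.

```python
import json

EMBEDDING_DIMS = 1536  # must equal the embedding model's output size


def index_body(dims=EMBEDDING_DIMS):
    """Request body for PUT /rag_docs (creates the index with its mapping)."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},  # assumed field for the chunk text
                "embedding": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def bulk_payload(chunks, dims=EMBEDDING_DIMS):
    """Build the NDJSON body for POST /rag_docs/_bulk.

    `chunks` is an iterable of (doc_id, text, vector) tuples (assumed shape).
    Each document becomes two lines: an action line and a source line.
    """
    lines = []
    for doc_id, text, vector in chunks:
        # Reject wrong-sized vectors before Elasticsearch does.
        if len(vector) != dims:
            raise ValueError(
                f"chunk {doc_id!r}: vector has {len(vector)} dims, expected {dims}"
            )
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps({"text": text, "embedding": list(vector)}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [("doc-1#0", "first chunk", [0.0] * EMBEDDING_DIMS)]
    body = bulk_payload(demo)
    print(len(body.splitlines()), "NDJSON lines ready for POST /rag_docs/_bulk")
```

Checking the vector length on the client side surfaces a dimension mismatch with the offending chunk ID, rather than as a per-item error buried in the bulk response.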

Architecture

Terraform acts as the declarative control plane, provisioning a private VPC, ECS compute nodes, an Elasticsearch cluster, and an RDS instance. The RAG application runs on ECS, routing user queries to Elasticsearch for dense-vector similarity search. Retrieved context is synthesized by an LLM, while the application concurrently writes session state, metadata, and logs to the RDS PostgreSQL backend. All inter-service traffic stays within the VPC via private endpoints.
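
The request path described above might look like the following Python sketch. It is illustrative only: handle_query, Answer, and the four injected callables are hypothetical names, and the real embedding, Elasticsearch, LLM, and RDS clients are assumed to be wrapped behind them. For simplicity the log write happens after generation rather than concurrently.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Answer:
    session_id: str
    query: str
    response: str
    context: List[str]


def handle_query(
    session_id: str,
    query: str,
    embed: Callable[[str], Sequence[float]],               # embedding model client
    search: Callable[[Sequence[float], int], List[str]],   # k-NN search against ES
    generate: Callable[[str, List[str]], str],             # LLM synthesis
    log: Callable[[str, str, str], None],                  # INSERT into conversation_logs
    k: int = 3,
) -> Answer:
    """Run one RAG turn: embed, retrieve, generate, then persist the exchange."""
    vector = embed(query)
    context = search(vector, k)
    response = generate(query, context)
    log(session_id, query, response)
    return Answer(session_id, query, response, context)


if __name__ == "__main__":
    # Stand-in implementations so the flow can be exercised without any service.
    rows = []
    answer = handle_query(
        "session-1",
        "What is our refund policy?",
        embed=lambda text: [0.0, 1.0],
        search=lambda vec, k: ["Refunds are processed within 14 days."][:k],
        generate=lambda q, ctx: f"Based on {len(ctx)} document(s): {ctx[0]}",
        log=lambda s, q, r: rows.append((s, q, r)),
    )
    print(answer.response)
    print(len(rows), "row(s) logged")
```

Passing the clients in as callables keeps the flow testable with stubs, and it makes the boundary between the retrieval workload (ES) and the operational workload (RDS) explicit.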

Prerequisites

Alibaba Cloud RAM user with AliyunElasticsearchFullAccess, AliyunRDSFullAccess, and AliyunECSFullAccess
Terraform CLI v1.5+ with alicloud provider configured
Pre-built Docker image for the RAG service
Valid API keys for embedding model and LLM provider

Common pitfalls

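Security Group Misconfiguration: ECS fails to reach ES/RDS unless access rules explicitly allow ports 9200/5432 from the ECS instances. Check the ES and RDS IP whitelists as well as any security group rules that reference the ECS security group ID.
Vector Dimension Mismatch: ES dims parameter must exactly match the embedding model output (e.g., 1536); a vector of any other length is rejected at index time. Switching embedding models without reindexing is a separate problem that degrades recall even when the dimensions happen to match.
RDS Connection Exhaustion: RAG apps often open unbounded DB connections without pooling, triggering PostgreSQL's FATAL: sorry, too many clients already error under load.
Terraform State Drift: Manual console changes to VPC routing or ES cluster scaling make the real infrastructure diverge from the configuration, so the next terraform plan proposes reverting them and, for attributes that cannot be updated in place, replacing the resource.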
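
For the connection-exhaustion pitfall, one option is to cap the number of open connections in the application. The sketch below is a minimal bounded pool built on the standard library, assuming a zero-argument connect factory supplied by whichever PostgreSQL driver the app uses; BoundedPool and its parameters are hypothetical names, and most drivers ship a production-grade pool that would normally be preferred.

```python
import queue
from contextlib import contextmanager


class BoundedPool:
    """Hands out at most `max_size` connections; callers wait for a free one."""

    def __init__(self, connect, max_size=5, timeout=10.0):
        self._connect = connect    # zero-argument factory returning a connection
        self._timeout = timeout    # seconds to wait for a free slot
        self._slots = queue.Queue(maxsize=max_size)
        for _ in range(max_size):
            self._slots.put(None)  # None marks a slot whose connection is not open yet

    @contextmanager
    def connection(self):
        try:
            conn = self._slots.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("no free database connection") from None
        try:
            if conn is None:
                conn = self._connect()  # open lazily, on first use of the slot
            yield conn
        finally:
            # Return the slot even if the caller raised. A connection that broke
            # during use is not detected here; a real pool would validate it.
            self._slots.put(conn)


if __name__ == "__main__":
    import sqlite3  # stand-in for a PostgreSQL driver so the demo runs anywhere

    pool = BoundedPool(lambda: sqlite3.connect(":memory:"), max_size=2)
    with pool.connection() as conn:
        print(conn.execute("SELECT 1").fetchone())
```

Keep max_size multiplied by the number of app containers below the instance's connection limit, so that a traffic spike queues requests inside the app instead of failing at the database.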

Typical questions

deploy full RAG stack
set up enterprise RAG infrastructure
provision and deploy RAG application
end-to-end RAG deployment
build RAG chatbot with backend
terraform RAG deploy
deploy a RAG application
build an enterprise-grade RAG system

FAQ

Q: How do I deploy an enterprise RAG application stack? A: Combine Terraform, Elasticsearch, and ApsaraDB RDS. Terraform provisions the cloud networking, compute, and managed data services; Elasticsearch provides vector-based knowledge retrieval for the AI application; and RDS, once its accounts and privileges are configured, serves as the relational backend that stores user sessions and conversation logs.