Deploy Enterprise RAG Application Stack

Use Terraform to provision the underlying cloud infrastructure (VPC, ECS clusters, networking), deploy a RAG AI application on Elasticsearch for vector-based knowledge retrieval, and configure RDS database accounts and permissions for the application's relational backend storing user sessions, metadata, and conversation logs.

Products involved

Scenario

Developers use this stack when building production AI assistants that require scalable vector search for private document retrieval alongside a relational database for user state, audit trails, and conversation history. Terraform automates secure provisioning of isolated networking, compute, and managed data services, while Elasticsearch and RDS handle the AI retrieval and operational workloads respectively.

Integration steps

  1. Provision Network & Compute: Follow terraform-provision-infrastructure. Define alicloud_vpc, alicloud_vswitch, and alicloud_instance (the alicloud provider's resource for ECS instances) in main.tf. Run terraform init && terraform apply -auto-approve.
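  2. Deploy Elasticsearch Cluster: Add alicloud_elasticsearch_instance with data_node_spec = "elasticsearch.n4.small.2l" and a version that supports dense_vector fields (Elasticsearch 7.x or later). dense_vector is a built-in mapping type, so vector search is configured in the index mapping (step 6) rather than through a flag on the resource. Apply to create the vector engine.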
  3. Provision RDS Backend: Add alicloud_db_instance with engine = "PostgreSQL" and engine_version = "14.0". Set vswitch_id to a vSwitch in the VPC from step 1 so that the instance is placed in a private subnet.
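  4. Configure RDS Accounts: Following rds-manage-accounts, create the application account: aliyun rds CreateAccount --DBInstanceId <rds-id> --AccountName rag_app --AccountPassword <pwd>. Then grant it access to the rag_metadata database, which must already exist on the instance: aliyun rds GrantAccountPrivilege --DBInstanceId <rds-id> --AccountName rag_app --DBName rag_metadata --AccountPrivilege DBOwner. For PostgreSQL instances the accepted AccountPrivilege value is DBOwner; ReadWrite applies to MySQL-family engines.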
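  5. Deploy RAG Container to ECS: On the ECS host, export the environment variables ES_ENDPOINT=https://<es-id>.elasticsearch.aliyuncs.com:9200, RDS_HOST=<rds-endpoint>, and RDS_PORT=5432. Start the app via docker run -d -e ES_ENDPOINT -e RDS_HOST -e RDS_PORT <rag-image>. An -e NAME flag without a value passes the variable through from the host shell, so all three must be exported first.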
  6. Initialize Vector Index & Ingest: Per es-deploy-application, create the index together with its mapping: PUT /rag_docs {"mappings":{"properties":{"embedding":{"type":"dense_vector","dims":1536}}}}. Bulk ingest via POST /rag_docs/_bulk, sending each text chunk together with its model-generated embedding vector.
  7. Validate Flow: Send a test query to the app. Confirm that the k-NN retrieval request reaches Elasticsearch and that the app persists the exchange to RDS, e.g. INSERT INTO conversation_logs (session_id, query, response) VALUES (...).
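
As a rough illustration of step 6, the request bodies can be assembled as in the sketch below. It only builds the JSON for PUT /rag_docs and the newline-delimited body for POST /rag_docs/_bulk; it does not send anything. The `text` field name, the helper names `index_body` and `bulk_payload`, and the `(doc_id, text, vector)` tuple layout are assumptions made for this example and do not come from es-deploy-application.

```python
import json

EMBEDDING_DIMS = 1536  # the dims value used in step 6


def index_body(dims=EMBEDDING_DIMS):
    """Body for PUT /rag_docs: creates the index with a dense_vector field."""
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},  # assumed field for the chunk text
                "embedding": {"type": "dense_vector", "dims": dims},
            }
        }
    }


def bulk_payload(chunks, index="rag_docs", dims=EMBEDDING_DIMS):
    """Build the NDJSON body for POST /rag_docs/_bulk.

    chunks is an iterable of (doc_id, text, vector) tuples (hypothetical layout).
    """
    lines = []
    for doc_id, text, vector in chunks:
        if len(vector) != dims:
            raise ValueError(f"{doc_id}: expected {dims} dims, got {len(vector)}")
        # Action line, then the document source on the following line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"text": text, "embedding": vector}))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


example_body = index_body()
example_payload = bulk_payload([("chunk-1", "Refunds are issued within 14 days.", [0.0] * EMBEDDING_DIMS)])
```

The bulk body is sent with Content-Type: application/x-ndjson. Checking the vector length before sending catches a mismatch between the embedding model and the mapping early, instead of surfacing it as a per-item error in the bulk response.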

Architecture

Terraform acts as the declarative control plane, provisioning a private VPC, ECS compute nodes, an Elasticsearch cluster, and an RDS instance. The RAG application runs on ECS, routing user queries to Elasticsearch for dense-vector similarity search. Retrieved context is synthesized by an LLM, while the application concurrently writes session state, metadata, and logs to the RDS PostgreSQL backend. All inter-service traffic stays within the VPC via private endpoints.
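
The query path described above might look roughly like the following sketch. The source does not say which search form the application uses, so this example assumes an exact (brute-force) script_score query with cosineSimilarity, which works against a dense_vector field mapped with only dims; an approximate k-NN search would need a different request. The in-memory sqlite3 database is a stand-in for the RDS PostgreSQL backend so the example runs without a server, and the column types and the helper names `knn_search_body` and `log_turn` are assumptions.

```python
import sqlite3


def knn_search_body(query_vector, k=5):
    """Body for POST /rag_docs/_search: exact cosine-similarity ranking."""
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # +1.0 keeps the score non-negative, as script_score requires.
                    "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        },
    }


def log_turn(conn, session_id, query, response):
    """Persist one exchange; parameters are bound, never string-formatted."""
    conn.execute(
        "INSERT INTO conversation_logs (session_id, query, response) VALUES (?, ?, ?)",
        (session_id, query, response),
    )
    conn.commit()


# Stand-in for the RDS backend (assumed schema with TEXT columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversation_logs (session_id TEXT, query TEXT, response TEXT)"
)

body = knn_search_body([0.1, 0.2, 0.3], k=3)  # toy 3-dim vector for illustration
log_turn(conn, "s-1", "What is our refund policy?", "(LLM answer)")
rows = conn.execute(
    "SELECT session_id, query, response FROM conversation_logs"
).fetchall()
```

Against PostgreSQL the same INSERT would go through a PostgreSQL driver, whose placeholder syntax typically differs from sqlite3's `?`.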

Prerequisites

Common pitfalls

Typical questions

FAQ

Q: How do I deploy an enterprise RAG application stack? A: Combine Terraform, Elasticsearch, and ApsaraDB RDS. Terraform provisions the cloud networking, compute, and managed data services; Elasticsearch provides vector-based knowledge retrieval for the AI application; and RDS, with an application account and privileges configured through the CLI, stores user sessions, metadata, and conversation logs.