Bailian RAG + ES Chatbot with EventBridge Alerts

Build a complete RAG pipeline: use Bailian for document ingestion, chunking, and embedding; deploy a RAG chatbot application on Elasticsearch as the vector search engine; and integrate EventBridge to route low-confidence answers and user feedback to DingTalk/Lark for human review. The result is a closed-loop enterprise knowledge base with continuous quality improvement.

Products involved

Scenario

Use this combination when building an enterprise-grade RAG chatbot that requires high-accuracy vector search, continuous quality monitoring, and human-in-the-loop review. Bailian handles document parsing, chunking, and embedding; Elasticsearch powers low-latency ANN retrieval and chatbot orchestration; and EventBridge automatically routes low-confidence responses or negative user feedback to DingTalk/Lark for rapid knowledge base refinement.

Integration steps

  1. Ingest & Embed in Bailian: Upload documents and trigger chunking/embedding via DashScope.
     POST https://dashscope.aliyuncs.com/api/v1/knowledge-bases/{kb_id}/documents with {"chunk_size": 500, "embedding_model": "text-embedding-v3"}.

  2. Sync Vectors to Elasticsearch: Export embeddings and bulk-index into ES. Ensure mapping includes "type": "dense_vector", "dims": 1024, "index": true, "similarity": "cosine".
  3. Deploy RAG Chatbot on ES: Configure the retrieval pipeline using the ES AI plugin.
     PUT /_ml/trained_models/bailian-rag-pipeline with a pipeline linking the vector field to https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation.

  4. Capture Confidence & Feedback: In your app layer, extract confidence_score and append a feedback hook. Emit structured JSON: {"event_type": "rag_response", "confidence": 0.62, "user_feedback": "thumbs_down", "query_id": "uuid"}.
  5. Configure EventBridge Rule: Create a rule to filter low-confidence events.
     aliyun eb CreateRule --RuleName LowConfidenceRAG --EventPattern '{"source": ["custom.rag-app"], "detail": {"confidence": [{"numeric": ["<", 0.75]}]}}'

  6. Route to DingTalk/Lark: Attach a webhook target. Map the payload to DingTalk’s schema:
     {"msgtype": "markdown", "markdown": {"title": "RAG Review Needed", "text": "Query: <detail.query_id>\nConfidence: <detail.confidence>"}}
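
The app-layer part of the steps above (building the telemetry event, deciding whether it falls under the review threshold, and shaping the DingTalk message) can be sketched in Python. This is a minimal illustration rather than an official SDK usage: the function names build_rag_event, needs_review, and to_dingtalk_markdown are hypothetical, the local threshold check only mirrors what the EventBridge rule would do server-side, and actually publishing the event to EventBridge is left out.

```python
import json
import uuid

# Mirrors the numeric "<" 0.75 filter in the EventBridge rule above.
REVIEW_THRESHOLD = 0.75


def build_rag_event(confidence, user_feedback=None, query_id=None):
    """Build the structured event from the Capture Confidence & Feedback step.

    Hypothetical helper: field names follow the JSON shown in the steps;
    a random UUID is generated when no query_id is supplied.
    """
    return {
        "event_type": "rag_response",
        "confidence": confidence,
        "user_feedback": user_feedback,
        "query_id": query_id or str(uuid.uuid4()),
    }


def needs_review(event, threshold=REVIEW_THRESHOLD):
    """Local stand-in for the rule: True when confidence is below the threshold."""
    return event["confidence"] < threshold


def to_dingtalk_markdown(event):
    """Map an event to the markdown message body shown in the routing step."""
    text = f"Query: {event['query_id']}\nConfidence: {event['confidence']}"
    return {
        "msgtype": "markdown",
        "markdown": {"title": "RAG Review Needed", "text": text},
    }


if __name__ == "__main__":
    event = build_rag_event(0.62, user_feedback="thumbs_down", query_id="uuid")
    if needs_review(event):
        print(json.dumps(to_dingtalk_markdown(event), indent=2))
```

Note that the rule shown in the steps matches on confidence only. Routing negative feedback on its own (for example a thumbs_down on a high-confidence answer) would presumably need an additional pattern or rule, which the steps do not show.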

Architecture

Documents flow into Bailian for parsing, chunking, and vectorization. Embeddings sync to an Elasticsearch dense_vector index for fast ANN retrieval. User queries hit the ES-hosted chatbot, which retrieves top-k chunks, forwards them to Bailian’s LLM for synthesis, and returns the answer. Concurrently, the application emits telemetry (confidence scores, feedback) to EventBridge. EventBridge evaluates payloads against threshold rules and pushes formatted alerts to DingTalk/Lark webhooks, enabling SMEs to update the Bailian knowledge base and close the quality loop.
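
To make the retrieval half of this flow more concrete, the sketch below builds the index mapping and a top-k query body as plain Python dictionaries. It assumes the standard Elasticsearch 8.x dense_vector mapping and the top-level knn search option; whether the ES AI plugin mentioned above accepts the same request shape is not confirmed here. The field names embedding and content, the default k and num_candidates values, and both function names are hypothetical.

```python
def dense_vector_mapping(field="embedding", dims=1024):
    """Index mapping with the dense_vector settings listed in the integration steps.

    The "content" text field is an assumed place to keep the chunk text.
    """
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text"},
                field: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def knn_search_body(query_vector, field="embedding", k=5,
                    num_candidates=50, dims=1024):
    """Request body for an approximate kNN search returning the top-k chunks."""
    if len(query_vector) != dims:
        raise ValueError(f"expected {dims} dimensions, got {len(query_vector)}")
    if num_candidates < k:
        # Elasticsearch rejects requests where num_candidates is smaller than k.
        raise ValueError("num_candidates must be >= k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        "_source": ["content"],
    }
```

The mapping would be sent when creating the index and the query body to the index's _search endpoint, with query_vector produced by the same embedding model used at ingestion so that the dimensions and similarity space match.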

Prerequisites

Common pitfalls

Typical questions