A developer builds a full custom RAG system by training both a domain-specific LLM and custom embedding models on PAI, with vector retrieval via OpenSearch/Elasticsearch and OSS. The developer then layers on comprehensive search relevance tuning, including neural reranking with the custom model, BM25 weight configuration, and synonym management, for production-grade retrieval quality.
Use this workflow when building a production-grade RAG system requiring domain-specific accuracy and granular search control. By training custom embedding and LLM models on PAI, deploying them via Bailian, and orchestrating hybrid retrieval through OpenSearch with OSS-backed storage, you achieve precise neural reranking alongside tuned BM25 and synonym configurations.
ossutil cp -r ./domain_data/ oss://rag-bucket/corpus/
pai submit --workspace ws-123 --job-type DLC --config '{"model": "qwen-7b", "train_data": "oss://rag-bucket/corpus/"}'
curl -X POST https://bailian.aliyuncs.com/v1/models/deploy -H "Authorization: Bearer $KEY" -d '{"model_id": "pai-embed-v1", "instance_type": "ml.gu7.xlarge"}'
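# The bulk API expects a flat [action, document, ...] list; using the list position as _id is an assumption.
actions = [line for i, (t, v) in enumerate(zip(texts, vectors)) for line in ({"index": {"_id": i}}, {"text": t, "vector": v})]
es.bulk(index="rag_docs", body=actions)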
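Before sending the bulk request, it can help to confirm that every embedding has the dimension the knn field was created with, since a mismatch is usually rejected at index time. The following is a minimal sketch of such a pre-flight check; the helper name `build_bulk_actions`, the `expected_dim` parameter, and the use of the list position as `_id` are illustrative assumptions, not part of the original workflow.

```python
from typing import Any, Dict, List, Sequence


def build_bulk_actions(
    texts: Sequence[str],
    vectors: Sequence[Sequence[float]],
    expected_dim: int,
) -> List[Dict[str, Any]]:
    """Return the flat [action, document, action, document, ...] list for a bulk request.

    Hypothetical helper: raises ValueError if inputs are misaligned or a
    vector does not have expected_dim entries.
    """
    if len(texts) != len(vectors):
        raise ValueError("texts and vectors must have the same length")

    actions: List[Dict[str, Any]] = []
    for doc_id, (text, vector) in enumerate(zip(texts, vectors)):
        if len(vector) != expected_dim:
            raise ValueError(
                f"document {doc_id}: vector has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )
        # Assumption: the list position serves as the document id.
        actions.append({"index": {"_id": doc_id}})
        actions.append({"text": text, "vector": list(vector)})
    return actions
```

The returned list could then be passed as the `body` argument of the bulk call shown above, with `expected_dim` set to whatever dimension the custom embedding model actually produces.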
PUT /rag_docs/_mapping {"properties": {"text": {"type": "text", "similarity": "BM25", "boost": 0.6}}}
PUT /rag_docs/_settings {"analysis": {"filter": {"syn": {"type": "synonym", "synonyms_path": "dict.txt"}}}}
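Because the synonym filter reads its rules from dict.txt, a quick check of that file before applying the settings can catch malformed lines early. The sketch below assumes only the two common rule shapes, `syn1, syn2 => canonical` and `syn1, syn2`, plus `#` comments; the analyzer's real grammar is broader, so treat this as a lint pass rather than a full parser. The function name is hypothetical.

```python
from typing import List, Tuple


def find_malformed_synonym_lines(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for lines that do not look like
    'syn1, syn2 => canonical' or 'syn1, syn2'. Line numbers start at 1."""
    problems: List[Tuple[int, str]] = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are skipped

        sides = line.split("=>")
        if len(sides) > 2:
            problems.append((number, "more than one '=>'"))
        elif any(term.strip() == "" for side in sides for term in side.split(",")):
            # Covers 'a,,b', a trailing comma, and an empty side of '=>'.
            problems.append((number, "empty term"))
    return problems
```

A typical use would be to read dict.txt with `open(..., encoding="utf-8")`, pass the lines to this function, and stop the deployment if the returned list is not empty.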
PUT /_search/pipeline/neural-rerank {"phase_results_processors": [{"rerank": {"model_id": "pai-rerank-v1", "endpoint": "https://bailian.aliyuncs.com/v1/rerank", "weight": 0.8}}]}
GET /rag_docs/_search {"pipeline": "neural-rerank", "query": {"hybrid": {"queries": [{"match": {"text": "query"}}, {"knn": {"vector": [...], "k": 50}}]}}}
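When the search is issued from application code, the request body above can be assembled by a small function so that the query text, query vector, and k are not hand-edited into JSON. This is a sketch that simply mirrors the request shape shown above; it has not been validated against a specific OpenSearch or Elasticsearch version, and the function name and defaults are illustrative assumptions.

```python
from typing import Any, Dict, List


def build_hybrid_search_body(
    query_text: str,
    query_vector: List[float],
    k: int = 50,
    pipeline: str = "neural-rerank",
) -> Dict[str, Any]:
    """Build a hybrid (lexical + vector) search body in the shape used above."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not query_vector:
        raise ValueError("query_vector must not be empty")

    return {
        "pipeline": pipeline,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical leg: BM25 match on the text field.
                    {"match": {"text": query_text}},
                    # Vector leg: nearest neighbours on the vector field.
                    {"knn": {"vector": query_vector, "k": k}},
                ]
            }
        },
    }
```

The resulting dictionary would be sent as the body of the search request against `rag_docs`; the query vector is expected to come from the same embedding model that produced the indexed vectors.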
Raw documents in OSS feed PAI for model training. PAI outputs optimized embedding and reranker weights, which are deployed as low-latency endpoints on Bailian. OpenSearch stores dense vectors and inverted indices, applying BM25 scoring and synonym filters at query time. OpenSearch then calls the Bailian reranker to re-score the top-k candidates, and the reranked context is passed to the Bailian-hosted LLM for generation.
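The query-time part of this flow (retrieve, rerank, generate) can be sketched as a single function that takes the three stages as callables. Everything here is an illustrative assumption: the function name `answer`, the callable signatures, the prompt layout, and the `context_size` cutoff. In a real deployment the callables would wrap the OpenSearch query and the Bailian reranker and LLM endpoints.

```python
from typing import Callable, List, Sequence

# Hypothetical stage signatures; real clients would be wrapped to match them.
Retriever = Callable[[str, int], List[str]]            # (question, top_k) -> passages
Reranker = Callable[[str, Sequence[str]], List[float]]  # (question, passages) -> scores
Generator = Callable[[str], str]                        # prompt -> answer


def answer(
    question: str,
    retrieve: Retriever,
    rerank: Reranker,
    generate: Generator,
    top_k: int = 50,
    context_size: int = 5,
) -> str:
    """Retrieve top_k candidates, re-score them, and generate from the best few."""
    candidates = retrieve(question, top_k)
    scores = rerank(question, candidates)
    if len(scores) != len(candidates):
        raise ValueError("reranker must return one score per candidate")

    # Highest reranker score first; only the best context_size passages are kept.
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    context = "\n\n".join(text for _, text in ranked[:context_size])

    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)
```

Passing the stages as parameters keeps the orchestration testable with stubs and makes it straightforward to swap the retriever or reranker without touching the control flow.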
search_pipeline plugin
alibabacloud-bailian-sdk, opensearch-py, and oss2
knn field dimensions.
syn1, syn2 => canonical syntax; malformed lines break index mapping.
timeout: 60s in the pipeline config.
file -i before ingestion.

Q: How do I build a custom RAG pipeline with optimized search relevance?
A: You can build a custom RAG system with optimized search relevance by combining PAI for training, OpenSearch or Elasticsearch with OSS for retrieval, and Bailian for deployment. This workflow supports training domain-specific LLMs and custom embeddings on PAI, then applying comprehensive tuning such as neural reranking, BM25 weight configuration, and synonym management.