In this workflow, a developer fine-tunes custom embedding and reranking models on PAI, deploys them to Bailian as managed inference endpoints, and configures Elasticsearch with neural reranking plus BM25 and synonym tuning. On that search foundation, the developer then builds a RAG chatbot and an AI recommendation engine that share the same retrieval infrastructure.
Use this workflow when building a domain-specific RAG chatbot that requires high retrieval precision and personalized recommendations. It’s ideal for developers who need to fine-tune embeddings and rerankers on PAI, serve them via Bailian, and unify search and recommendation logic under a single OpenSearch/Elasticsearch retrieval layer.
Place the training data under oss://bucket/data/.

Launch a PAI-DLC job:
    pai-dlc create-job --name emb-tune --image pai/pytorch:2.0 --script train.py --oss-input oss://bucket/data/ --oss-output oss://bucket/models/

Register the exported model with Bailian:
    POST /v1/models {"model_name":"custom-emb","oss_path":"oss://bucket/models/"}
then create an endpoint:
    POST /v1/endpoints {"model_id":"m-123"}
to get endpoint_id: ep-abc.

Create the index with an analyzed text field and a KNN vector field:
    PUT /rag-docs {"mappings":{"properties":{"content":{"type":"text","analyzer":"ik_max_word"},"embedding":{"type":"knn_vector","dimension":768}}},"settings":{"index.knn":true}}
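The index definition above can also be generated and checked in code, so that the vector size is set in one place. The following is a minimal sketch, not the workflow's own tooling: the helper names `build_index_body` and `check_vector` are hypothetical, and the body simply mirrors the `PUT /rag-docs` request shown in this section.

```python
import json


def build_index_body(dimension: int, analyzer: str = "ik_max_word") -> dict:
    """Build the rag-docs index body with a configurable vector dimension.

    Hypothetical helper: the field names and settings mirror the
    PUT /rag-docs request in this section.
    """
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    return {
        "settings": {"index.knn": True},
        "mappings": {
            "properties": {
                "content": {"type": "text", "analyzer": analyzer},
                "embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def check_vector(body: dict, vector: list, field: str = "embedding") -> None:
    """Raise ValueError if a vector's length differs from the mapped dimension."""
    expected = body["mappings"]["properties"][field]["dimension"]
    if len(vector) != expected:
        raise ValueError(f"{field}: expected {expected} values, got {len(vector)}")


body = build_index_body(768)
check_vector(body, [0.0] * 768)  # passes silently when the sizes agree
print(json.dumps(body, indent=2))
```

Running a check like this against one vector returned by the deployed embedding endpoint, before bulk indexing, is one way to catch a mismatch between the mapping and the model configuration early.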
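Set BM25 as the default similarity and register synonyms:
    PUT /rag-docs/_settings {"index.similarity.default.type":"BM25","index.analysis.filter.synonyms.synonyms":["ai, artificial intelligence"]}
Analysis settings such as the synonym filter are static, so the index typically has to be closed before this update and reopened afterward.

Run a neural query against the embedding field:
    GET /rag-docs/_search {"query":{"neural":{"embedding":{"query_text":"prompt","model_id":"ep-abc","k":50}}}}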
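To combine the BM25 text match with the neural clause in one request, the two can be placed in a `bool` query. The sketch below only builds the request body; the function name `build_hybrid_query` and the `text_boost` parameter are hypothetical, and it assumes the cluster accepts a `neural` clause inside `bool.should`, which should be verified against the search engine version in use.

```python
def build_hybrid_query(query_text: str, model_id: str, k: int = 50,
                       text_boost: float = 1.0, size: int = 10) -> dict:
    """Build a bool query mixing a BM25 match clause with a neural clause.

    Hypothetical helper. Only the text clause carries an explicit boost;
    the neural clause keeps its default weight, which is an assumption
    made to keep the sketch simple.
    """
    if text_boost < 0:
        raise ValueError("text_boost must not be negative")
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    # Lexical clause scored with the index's BM25 similarity.
                    {"match": {"content": {"query": query_text,
                                           "boost": text_boost}}},
                    # Vector clause, same shape as the GET /rag-docs/_search example.
                    {"neural": {"embedding": {"query_text": query_text,
                                              "model_id": model_id,
                                              "k": k}}},
                ]
            }
        },
    }


query = build_hybrid_query("prompt", "ep-abc", k=50, text_boost=0.5)
```

Lowering `text_boost` shifts ranking toward the vector clause and raising it favors exact term matches; suitable values depend on the corpus and are best chosen against a labeled relevance set.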
Register the index as an AIRec data source:
    POST /v2/instances/{id}/datasources {"type":"elasticsearch","endpoint":"es-url","index":"rag-docs"}
Push interaction logs and call POST /v2/recommend to blend results.

Documents flow from OSS to PAI for contrastive fine-tuning. Exported weights are deployed to Bailian as REST inference endpoints. OpenSearch/ES acts as the unified retrieval backbone, storing BM25 text, synonym rules, and KNN vectors. The RAG app queries ES, passes the top chunks to a Bailian LLM, and returns answers. AIRec consumes the same ES index and user logs to serve personalized recommendations alongside chat responses.
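The application-side part of this flow (take the top chunks from a search response, build the LLM prompt, and attach recommendations to the reply) can be sketched as follows. This is an illustration under assumptions: the function names, the prompt wording, and the reply shape are hypothetical, and the calls to the Bailian LLM and to AIRec are left out, so the answer and the recommendation list are passed in as plain values.

```python
def top_chunks(es_response: dict, max_chunks: int = 3) -> list:
    """Return the content of the highest-scoring hits from a search response."""
    hits = es_response.get("hits", {}).get("hits", [])
    ranked = sorted(hits, key=lambda h: h.get("_score") or 0.0, reverse=True)
    return [h["_source"]["content"] for h in ranked[:max_chunks]]


def build_prompt(question: str, chunks: list) -> str:
    """Number the retrieved chunks and place them ahead of the question."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def assemble_reply(answer: str, recommendations: list, max_items: int = 3) -> dict:
    """Pair the chat answer with a capped list of recommended items."""
    return {"answer": answer, "recommendations": recommendations[:max_items]}


# Sample response in the standard search hit layout; the contents are made up.
sample = {"hits": {"hits": [
    {"_score": 1.2, "_source": {"content": "Refunds take five days."}},
    {"_score": 3.4, "_source": {"content": "Refunds need an order ID."}},
    {"_score": 0.5, "_source": {"content": "Shipping is free."}},
]}}

prompt = build_prompt("How do refunds work?", top_chunks(sample, max_chunks=2))
print(prompt)
```

Keeping prompt assembly separate from the reply assembly means the chat path and the recommendation path can fail or be tuned independently while still reading from the same index.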
- knn and neural-search plugins enabled.
- dimension in _mapping matches model config.
- bool queries with tuned boost values per clause.
- explore strategy in API calls.

Q: How do I construct a full pipeline that trains custom models, optimizes Elasticsearch relevance, and combines a RAG chatbot with an AI recommendation engine?

A: This pipeline is constructed by fine-tuning custom embedding and reranking models on PAI, deploying them to Bailian as managed inference endpoints, and configuring Elasticsearch with neural reranking and BM25 tuning to power both a RAG chatbot and an AI recommendation engine. The entire setup shares a unified retrieval infrastructure and can be deployed using predefined combination skills like "Train Custom Model, Optimize ES Relevance" and "Custom Model-Enhanced RAG Recommendation Platform."