A team fine-tunes custom embedding and reranking models on PAI and deploys them via Bailian for neural reranking across both Elasticsearch and OpenSearch. It implements hybrid retrieval (vector + BM25) in OpenSearch, then layers AIRec on top to deliver personalized semantic search results through a chatbot interface.
This combination suits enterprise RAG chatbots that need high-recall hybrid search across multiple engines together with personalized, user-aware result ranking. It connects custom PAI-trained models to Bailian’s unified inference, OpenSearch/ES hybrid retrieval, and AIRec’s real-time personalization layer.
1. Fine-tune the models on PAI: pai-dlc submit --image registry.cn-hangzhou.aliyuncs.com/pai/llm-finetune:v2 --oss-bucket oss://my-models/ --output oss://my-models/reranker-v1/
2. Deploy via Bailian: POST https://dashscope.aliyuncs.com/api/v1/models/deploy with payload {"model_id": "reranker-v1", "instance_type": "ml.gu7.xlarge"}.
3. Set up hybrid retrieval: configure knn and bm25 in your index mapping. Query both engines with: POST /_search {"query": {"bool": {"should": [{"match": {"content": "$query"}}, {"knn": {"vector_field": {"vector": $embedding, "k": 50}}}]}}}.
4. Rerank with Bailian: POST https://dashscope.aliyuncs.com/api/v1/services/rerank/rerank with {"model": "reranker-v1", "documents": [{"text": "...", "id": "..."}]}.
5. Personalize with AIRec: POST /v2/openapi/instances/{instanceId}/actions/bulk. Configure AIRec to ingest Bailian scores as a custom feature: {"features": {"bailian_score": 0.87, "user_id": "u123"}}.

User queries hit the chatbot orchestrator, which calls Bailian for query embedding. The embeddings are routed to OpenSearch and ES for parallel BM25 + vector retrieval. The top candidates are merged and passed to Bailian for neural reranking. The final scores, combined with real-time user profiles, feed into AIRec for personalized ranking. AIRec returns the ordered list, which the chatbot passes to the LLM for response generation. PAI handles offline training and EAS fallback serving.
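The parallel retrieval step described above can be sketched in Python. This is a minimal sketch under stated assumptions, not a documented client API: build_hybrid_query reproduces the bool/should body quoted in the text, while retrieve_parallel and the per-engine search callables are hypothetical names introduced for illustration. The callables are supplied by the caller (for example, thin wrappers around whichever OpenSearch and Elasticsearch clients you use), so no specific SDK call is assumed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

Hit = Dict[str, object]
# Hypothetical signature: takes a search body, returns a list of hits.
SearchFn = Callable[[dict], List[Hit]]


def build_hybrid_query(query_text: str, embedding: Sequence[float], k: int = 50) -> dict:
    """Build the bool/should body from the text: a BM25 match plus a kNN clause."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"content": query_text}},
                    {"knn": {"vector_field": {"vector": list(embedding), "k": k}}},
                ]
            }
        }
    }


def retrieve_parallel(body: dict, engines: Dict[str, SearchFn]) -> Dict[str, List[Hit]]:
    """Send the same body to every engine concurrently and collect hits per engine."""
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(search, body) for name, search in engines.items()}
        # result() re-raises any exception from the engine call, so failures surface here.
        return {name: future.result() for name, future in futures.items()}
```

Passing the engines in as plain callables keeps the orchestrator independent of any one client library and makes it easy to substitute stubs during testing.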
Required: dashscope and model-studio permissions, and the knn plugin enabled.

Vector fields need a dense_vector mapping; otherwise, kNN queries fail with 400 Bad Request.
Without user_id and item_id in both search results and AIRec feeds, personalization defaults to popularity-based ranking.
Normalize scores with min-max scaling before merging.

Q: How do you build a cross-engine RAG system with hybrid retrieval and personalized recommendations?
A: Fine-tune custom embedding and reranking models on PAI and deploy them via Bailian for neural reranking across both Elasticsearch and OpenSearch. The system implements hybrid vector and BM25 retrieval in OpenSearch, then layers AIRec on top for personalized recommendations.
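For the min-max scaling mentioned in the notes above, here is a minimal Python sketch of normalizing each engine's scores to [0, 1] before merging. The function names are hypothetical, and two choices are assumptions rather than anything the workflow prescribes: a document returned by several engines keeps its highest normalized score, and a result set whose scores are all equal maps to 1.0.

```python
from typing import Dict, Iterable, List, Tuple


def min_max_scale(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map each to 1.0 (assumption)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (score - lo) / (hi - lo) for doc_id, score in scores.items()}


def merge_candidates(
    result_sets: Iterable[Dict[str, float]], top_n: int = 50
) -> List[Tuple[str, float]]:
    """Normalize each result set, then keep the best normalized score per document."""
    merged: Dict[str, float] = {}
    for scores in result_sets:
        for doc_id, score in min_max_scale(scores).items():
            # Assumption: combine duplicates by taking the maximum normalized score.
            merged[doc_id] = max(merged.get(doc_id, 0.0), score)
    # Sort by score descending, then by id so ties are ordered deterministically.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))[:top_n]
```

Scaling per result set matters because raw BM25 scores are unbounded while vector similarity scores typically fall in a narrow range, so merging raw values would let one engine dominate the candidate list.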