Custom Search Relevance Model Pipeline

A developer fine-tunes a custom embedding or reranking model on PAI using domain-specific search data, deploys it as a managed inference endpoint on Bailian, then integrates it into Elasticsearch for neural reranking to optimize search result relevance for their application.

Products involved

OSS (training data and model artifact storage), PAI (fine-tuning via PAI-DLC), Bailian (managed inference endpoint), and Elasticsearch (retrieval and neural reranking).

Scenario

Use this pipeline when generic search models fail to capture domain-specific terminology or ranking preferences. Fine-tuning a custom embedding model or cross-encoder reranker on PAI, deploying it as a managed Bailian endpoint, and wiring it into Elasticsearch’s neural reranking pipeline gives you search relevance tuned to your domain without provisioning dedicated inference clusters.

Integration steps

  1. Stage training data in OSS: Upload query-document relevance pairs to an OSS bucket. Run `ossutil cp ./train_data.jsonl oss://<bucket>/pai-training/`.
  2. Fine-tune on PAI: Trigger a PAI-DLC job using the pai-deploy-inference intent pattern. Configure the training spec:

     ```bash
     pai-cli job submit --name rerank-ft \
       --image registry.cn-hangzhou.aliyuncs.com/pai/pytorch:2.1 \
       --code oss://<bucket>/pai-training/train.py \
       --data oss://<bucket>/pai-training/ \
       --output oss://<bucket>/pai-training/model_artifacts/ \
       --instance_type ml.gn7i-c8g1.2xlarge
     ```

  3. Deploy via Bailian: Register the checkpoint and provision an endpoint using bailian-deploy-model:

     ```bash
     curl -X POST https://dashscope.aliyuncs.com/api/v1/models/deploy \
       -H "Authorization: Bearer $BAILIAN_API_KEY" \
       -H "Content-Type: application/json" \
       -d '{"model_name": "custom-rerank-v1", "source_uri": "oss://<bucket>/pai-training/model_artifacts/", "instance_type": "ml.gu7i.large"}'
     ```

  4. Configure ES rerank pipeline: Follow es-optimize-results routing to install the neural-search plugin. Create a search pipeline that proxies to Bailian:

     ```json
     PUT /_search/pipeline/bailian-rerank
     {
       "description": "Neural rerank via Bailian",
       "request_processors": [
         {
           "neural_rerank": {
             "model_id": "custom-rerank-v1",
             "endpoint": "https://dashscope.aliyuncs.com/api/v1/services/rerank",
             "field": "query_text",
             "top_k": 10,
             "headers": {"Authorization": "Bearer $BAILIAN_API_KEY"}
           }
         }
       ]
     }
     ```

  5. Execute reranked queries: Attach the pipeline to your search request:

     ```json
     GET /catalog/_search?search_pipeline=bailian-rerank
     {
       "query": {"multi_match": {"query": "enterprise storage", "fields": ["title^2", "desc"]}},
       "size": 5
     }
     ```
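
The steps above do not specify the layout of `train_data.jsonl`. As a minimal sketch, assuming one JSON object per line with the hypothetical fields `query`, `document`, and `label` (none of these names come from PAI or from the pipeline itself), the relevance pairs could be written and sanity-checked locally before the upload in step 1:

```python
import json
from pathlib import Path

# Hypothetical schema: the field names below are assumptions for illustration.
# Use whatever fields your own train.py actually reads.
REQUIRED_FIELDS = ("query", "document", "label")


def write_pairs(pairs, path):
    """Write relevance pairs to a JSONL file, one JSON object per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for pair in pairs:
            missing = [k for k in REQUIRED_FIELDS if k not in pair]
            if missing:
                raise ValueError(f"pair is missing fields: {missing}")
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")


def read_pairs(path):
    """Read the JSONL file back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative data only: one relevant and one irrelevant document per query.
example_pairs = [
    {"query": "enterprise storage", "document": "Scalable NAS for data centers", "label": 1},
    {"query": "enterprise storage", "document": "Kitchen storage jars", "label": 0},
]
```

Calling `write_pairs(example_pairs, "train_data.jsonl")` produces a file that can be uploaded with the `ossutil cp` command from step 1. Rejecting incomplete records at write time is a design choice that surfaces data problems before a training job is submitted.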

Architecture

Training data flows from OSS into PAI-DLC for distributed fine-tuning. The resulting model weights are persisted in OSS and registered in Bailian’s model catalog. Bailian provisions a serverless inference endpoint that exposes a REST API compatible with ES/OpenSearch neural plugins. At query time, ES performs initial BM25/vector retrieval, forwards top-N candidates to the Bailian endpoint for cross-encoder scoring, and reorders results before returning them to the client.
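
The query-time flow described above (first-stage retrieval, then rescoring of the top-N candidates) can be illustrated with a small local sketch. It makes no network calls: `overlap_score` is a toy stand-in for the cross-encoder that the Bailian endpoint would serve, and the `rerank` helper and the hit layout (`id`, `text`) are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List

Hit = Dict[str, str]


def rerank(query: str, hits: List[Hit],
           score_fn: Callable[[str, str], float], top_n: int = 10) -> List[Hit]:
    """Re-score the first top_n hits with score_fn and reorder them.

    Hits beyond top_n keep their first-stage order and stay below the
    reranked head, mirroring a rerank step that only sees top-N candidates.
    """
    head, tail = hits[:top_n], hits[top_n:]
    scored = [(score_fn(query, h["text"]), i, h) for i, h in enumerate(head)]
    # Higher score first; ties fall back to the original first-stage rank.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [h for _, _, h in scored] + tail


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query tokens in the text."""
    q = set(query.lower().split())
    d = set(text.lower().split())
    return len(q & d) / len(q) if q else 0.0


# Pretend these came back from BM25/vector retrieval, in first-stage order.
first_stage = [
    {"id": "a", "text": "storage jars for the kitchen"},
    {"id": "b", "text": "enterprise storage arrays"},
    {"id": "c", "text": "enterprise backup software"},
]
reranked = rerank("enterprise storage", first_stage, overlap_score, top_n=2)
print([h["id"] for h in reranked])  # ['b', 'a', 'c']
```

Only the first two hits are rescored here, so document `c` keeps its position even though it would outscore `a`. This is the usual trade-off of a top-N rerank: a smaller N lowers inference cost but limits how far a relevant document can be promoted.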

Prerequisites

Common pitfalls

Typical questions

FAQ

Q: How do I fine-tune and deploy a custom search relevance model for Elasticsearch?

A: Fine-tune a custom embedding or reranking model on PAI, deploy it as a managed inference endpoint on Bailian, and integrate it into Elasticsearch for neural reranking. This pipeline uses domain-specific search data to optimize your application's search result relevance.