Train and serve custom embedding/ranking models on Alibaba Cloud Linux instances, deploy those embeddings into OpenSearch for vector-based semantic retrieval, then layer AIRec on top to orchestrate a full recommendation pipeline that leverages both custom model inference and semantic search for higher-quality personalized recommendations.
Use this integration when standard recommendation algorithms lack domain-specific context or semantic understanding. By training custom embedding and ranking models on Alibaba Cloud Linux (ALinux) instances, indexing vectors in OpenSearch, and orchestrating the pipeline through AIRec, you achieve highly personalized, context-aware recommendations that combine semantic recall with fine-tuned ranking.
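Step 1 (`alinux-deploy-model`): Provision a GPU-enabled ECS instance and serve your PyTorch/ONNX model via FastAPI:

```bash
# RunCommand needs a region and a command type; instance IDs are passed as an indexed list.
aliyun ecs RunCommand --RegionId <region-id> --Type RunShellScript \
  --InstanceId.1 i-uf6xxxx \
  --CommandContent "pip install fastapi uvicorn torch && python serve_model.py --port 8080"
```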
Endpoints: `http://<alinux-vpc-ip>:8080/infer` (embedding) and `/rank` (ranking).

Step 2 (`opensearch-deploy-model`): Create an index with a `dense_vector` mapping matching your model’s output dimension:

```json
PUT /airec_semantic_index
{
  "mappings": {
    "properties": {
      "item_id": { "type": "keyword" },
      "embedding": {
        "type": "dense_vector",
        "dims": 768,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}
```
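To make the `"similarity": "cosine"` setting concrete, the following is a minimal pure-Python sketch of the cosine score between two vectors. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions; a real index would hold the 768-dimensional vectors produced by your model, and the search engine may rescale the raw value into its own score range.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical): magnitude does not matter, only direction.
query = [1.0, 0.0, 0.0]
same_direction = [2.0, 0.0, 0.0]
orthogonal = [0.0, 3.0, 0.0]

score_same = cosine_similarity(query, same_direction)
score_orthogonal = cosine_similarity(query, orthogonal)
```

Vectors pointing the same way score close to 1 regardless of length, while unrelated (orthogonal) vectors score close to 0, which is why cosine is a common choice for text embeddings.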
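Ingest the vectors through the `_bulk` endpoint. Use `--data-binary` rather than `-d`, because `-d` strips the newlines that the bulk format depends on:

```bash
curl -X POST "http://<opensearch-host>:9200/airec_semantic_index/_bulk" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @batch_vectors.json
```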
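The source does not show how `batch_vectors.json` is produced. A minimal sketch, assuming the usual newline-delimited bulk layout (one action line followed by one document line per item) and the `item_id` and `embedding` fields from the mapping, could look like this. The helper name `build_bulk_payload` and the sample vectors are hypothetical.

```python
import json
from typing import Dict, List

EXPECTED_DIMS = 768  # assumption: must equal "dims" in the index mapping


def build_bulk_payload(vectors: Dict[str, List[float]],
                       dims: int = EXPECTED_DIMS) -> str:
    """Serialize item vectors as newline-delimited bulk actions.

    Raises ValueError before anything is sent if a vector has the wrong
    length, instead of letting the ingestion request fail later.
    """
    lines = []
    for item_id, embedding in vectors.items():
        if len(embedding) != dims:
            raise ValueError(
                f"{item_id}: expected {dims} dimensions, got {len(embedding)}"
            )
        # Action line: index the document under its item_id.
        lines.append(json.dumps({"index": {"_id": item_id}}))
        # Source line: the fields declared in the index mapping.
        lines.append(json.dumps({"item_id": item_id, "embedding": embedding}))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical sample: two items with placeholder vectors.
sample = {"item-001": [0.0] * EXPECTED_DIMS, "item-002": [0.1] * EXPECTED_DIMS}
payload = build_bulk_payload(sample)
```

Writing the returned string to `batch_vectors.json` gives a file in the shape the `_bulk` request expects.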
Step 3 (`airec-deploy-service`): Initialize the recommendation instance:

```bash
aliyun airec CreateInstance --InstanceType standard --RegionId cn-hangzhou --Name "custom-semantic-rec"
```
Register the ALinux `/rank` endpoint as a custom ranking service via `aliyun airec UpdateDataSource --InstanceId <id> --Config '{"recall":"opensearch","rank":"http://<alinux-ip>:8080/rank"}'`.

Pipeline stages: `SemanticRecall (OpenSearch) -> CustomRank (ALinux) -> DiversityFilter -> Serve`. Validate with `aliyun airec CreatePipeline --InstanceId <id> --Config pipeline_v1.json`.

Item metadata flows through the ALinux-hosted embedding model, which generates the vectors stored in OpenSearch. At query time, AIRec triggers semantic recall against OpenSearch, retrieves candidate items, and passes them to the ALinux ranking endpoint for personalized scoring. AIRec handles business logic, diversity filtering, and final response formatting, while ALinux handles ML inference and OpenSearch handles vector search.
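The stage order `SemanticRecall -> CustomRank -> DiversityFilter -> Serve` can be illustrated with an in-memory sketch. Everything here is a stand-in under stated assumptions: `semantic_recall` replaces the OpenSearch vector query with a local cosine sort, `custom_rank` replaces the call to the ALinux `/rank` endpoint with a sort on a hypothetical `popularity` field, and the `category` field used for diversity is likewise invented for the example. It is not how AIRec implements these stages.

```python
import math
from typing import Dict, List

Item = Dict[str, object]


def _cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def semantic_recall(query_vector: List[float], index: List[Item], top_k: int) -> List[Item]:
    """Stand-in for the OpenSearch query: top_k items by cosine similarity."""
    by_similarity = sorted(
        index, key=lambda item: _cosine(query_vector, item["embedding"]), reverse=True
    )
    return by_similarity[:top_k]


def custom_rank(user_id: str, candidates: List[Item]) -> List[Item]:
    """Stand-in for the /rank endpoint: a real ranker would score per user."""
    return sorted(candidates, key=lambda item: item["popularity"], reverse=True)


def diversity_filter(ranked: List[Item], max_per_category: int) -> List[Item]:
    """Keep at most max_per_category items from each category, in rank order."""
    kept: List[Item] = []
    seen: Dict[str, int] = {}
    for item in ranked:
        count = seen.get(item["category"], 0)
        if count < max_per_category:
            kept.append(item)
            seen[item["category"]] = count + 1
    return kept


def recommend(user_id: str, query_vector: List[float], index: List[Item],
              top_k: int = 3, max_per_category: int = 1) -> List[str]:
    candidates = semantic_recall(query_vector, index, top_k)
    ranked = custom_rank(user_id, candidates)
    diverse = diversity_filter(ranked, max_per_category)
    return [item["item_id"] for item in diverse]  # the "Serve" step


# Hypothetical catalog with 2-dimensional toy embeddings.
CATALOG: List[Item] = [
    {"item_id": "a", "embedding": [1.0, 0.0], "category": "shoes", "popularity": 0.2},
    {"item_id": "b", "embedding": [0.9, 0.1], "category": "shoes", "popularity": 0.9},
    {"item_id": "c", "embedding": [0.7, 0.7], "category": "bags", "popularity": 0.5},
    {"item_id": "d", "embedding": [0.0, 1.0], "category": "hats", "popularity": 1.0},
]
result = recommend("user-1", [1.0, 0.0], CATALOG)
print(result)  # ['b', 'c']
```

In this toy run, item `d` never reaches the ranker because recall excludes it, and item `a` is recalled but removed by the diversity step because `b` already fills the `shoes` slot. This shows why each stage can only narrow what the previous one returned.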
Required RAM policies: `AliyunAIRECFullAccess`, `AliyunOpenSearchFullAccess`, `AliyunECSFullAccess`.

`dims` must exactly match your model’s output vector size; otherwise, `_bulk` ingestion fails with `mapper_parsing_exception`.

Records must include the `item_id`, `user_id`, and `behavior` fields; missing or malformed fields cause pipeline execution to silently drop candidates.

Q: How do I deploy a full-stack recommendation pipeline with custom models and semantic search?

A: Train and serve custom embedding or ranking models on Alibaba Cloud Linux instances, index those embeddings in OpenSearch for vector-based retrieval, and layer AIRec on top to orchestrate the complete recommendation pipeline. This cross-product combination integrates AIRec, OpenSearch, and Alibaba Cloud Linux to deliver higher-quality personalized recommendations through unified custom model inference and semantic search.
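Because records with missing or malformed `item_id`, `user_id`, or `behavior` fields are dropped silently, it can help to check them before they enter the pipeline. The sketch below assumes each field should be a non-empty string, which the source does not specify; the helper name `find_invalid_records` and the sample records are hypothetical.

```python
from typing import Dict, List, Tuple

REQUIRED_FIELDS = ("item_id", "user_id", "behavior")


def find_invalid_records(records: List[Dict[str, object]]) -> List[Tuple[int, List[str]]]:
    """Return (position, problem_fields) for every record that would be at risk.

    A field counts as a problem if it is absent, not a string, or blank.
    """
    problems = []
    for position, record in enumerate(records):
        bad_fields = [
            field
            for field in REQUIRED_FIELDS
            if not isinstance(record.get(field), str) or not record[field].strip()
        ]
        if bad_fields:
            problems.append((position, bad_fields))
    return problems


# Hypothetical sample: the second record lacks user_id, the third has a blank behavior.
sample_records = [
    {"item_id": "item-001", "user_id": "u-1", "behavior": "click"},
    {"item_id": "item-002", "behavior": "click"},
    {"item_id": "item-003", "user_id": "u-2", "behavior": "  "},
]
invalid = find_invalid_records(sample_records)
```

Reporting the position and the offending fields makes the problem visible up front, instead of surfacing later as an unexplained drop in candidate counts.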