# Elasticsearch Model Deployment Console Guide

## Operations Overview

| Operation | Console Entry | Prerequisites | Description |
|----------|---------------|---------------|-------------|
| Deploy Model Service | Console > AI Search Open Platform > Model Service > Service Deployment | - An AI Search Open Platform account<br>- (RAM users only) The Model Service-Service Deployment permission granted to your RAM user | Publish custom models as API endpoints through the console. |

## Operation Steps

### Deploy Model Service

**Navigation**: Console > AI Search Open Platform > Model Service > Service Deployment

**Prerequisites**:
- An AI Search Open Platform account
- (RAM users only) The Model Service-Service Deployment permission granted to your RAM user

1. Go to **Model Service > Service Deployment** and click the **Deploy Service** button.
   - Element: **Deploy Service** (button) — located in the top-right corner of the Service Deployment page
   - Notes: A new configuration form will appear.

2. In the deployment form, fill in the **Service name**.
   - Element: **Service name** (text_input) — in the main content area
   - Notes: This name identifies your deployment and must be unique within your account.

3. Select the **Deployment region** from the dropdown.
   - Element: **Deployment region** (dropdown) — in the main content area
   - Notes: Only **Germany (Frankfurt)** is currently supported as the deployment region.

4. Choose a **Resource type** from the dropdown.
   - Element: **Resource type** (dropdown) — in the main content area
   - Notes: This determines the compute resources allocated for model inference.

5. Review the **Estimated price** displayed below the form fields.
   - Element: **Estimated price** (text_input) — read-only field in the main content area
   - Notes: The price updates automatically based on your selected configuration.

6. Click the **Deploy** button to start provisioning the service.
   - Element: **Deploy** (button) — at the bottom of the Deploy Service form
   - Notes: After clicking, the system begins deployment. Wait for the service status to change from "Deploying" to "Normal".

7. Once deployment completes, locate your service in the service list and click **Manage** to view service details.
   - Element: **Manage** (button) — in the service status row of the service list
   - Notes: Available actions depend on the current service status (e.g., Normal, Deploying, Deployment Failed). From this page, you can retrieve your Service ID, API endpoint, and credentials.
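The wait in step 6 can also be automated. The following is a minimal sketch, not an official client: `get_status` is a hypothetical callable that you would supply yourself (the guide does not document a status API), and the polling interval and timeout are illustrative assumptions. Only the status names come from this guide.

```python
import time
from typing import Callable

# Status names as shown in the console; anything else is treated as "still in progress".
TERMINAL_STATUSES = {"Normal", "Deployment Failed"}


def wait_for_deployment(
    get_status: Callable[[], str],   # hypothetical: returns the current service status
    interval_s: float = 10.0,        # assumed polling interval
    timeout_s: float = 600.0,        # assumed upper bound on the wait
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until the service reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"service still in status {status!r} after {timeout_s}s")
        sleep(interval_s)
```

A caller would proceed to step 7 only when the function returns `"Normal"`, and inspect the error message in the console when it returns `"Deployment Failed"`.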

| Parameter | Type | Required | Options/Values | Description |
|-----------|------|----------|----------------|-------------|
| Service name | text_input | Yes | — | A name to identify this deployment |
| Deployment region | dropdown | Yes | Germany (Frankfurt) | The region where the service runs. Currently, only Germany (Frankfurt) is supported. |
| Resource type | dropdown | Yes | — | The compute resource type for model inference |
| Estimated price | text_input | No | — | The estimated cost for the selected configuration |
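If you track deployments in scripts or review checklists, the required parameters above can be checked before you open the console. This is an illustrative sketch only: the `DeploymentForm` class and its field names are hypothetical and do not correspond to any platform API, and the resource type value in the usage example is a placeholder because this guide does not list the available options.

```python
from dataclasses import dataclass
from typing import List

# The only region this guide lists as supported.
SUPPORTED_REGIONS = ("Germany (Frankfurt)",)


@dataclass(frozen=True)
class DeploymentForm:
    """Hypothetical local mirror of the required fields in the Deploy Service form."""
    service_name: str
    deployment_region: str
    resource_type: str

    def problems(self) -> List[str]:
        """Return a list of human-readable issues; an empty list means the form looks complete."""
        issues = []
        if not self.service_name.strip():
            issues.append("Service name is required")
        if self.deployment_region not in SUPPORTED_REGIONS:
            issues.append(f"Unsupported deployment region: {self.deployment_region!r}")
        if not self.resource_type.strip():
            issues.append("Resource type is required")
        return issues


# Usage: "example-resource-type" is a placeholder, not a real option name.
form = DeploymentForm("my-embedding-service", "Germany (Frankfurt)", "example-resource-type")
print(form.problems())
```

The check does not cover name uniqueness, which only the console can verify against your account.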

## FAQ

Q: Where can I find my deployed service’s API endpoint and credentials?
A: After deployment, go to the service list, click **Manage** next to your service, and view the details page. The public/private API endpoint, Service ID, API Key, and Token are displayed there.

Q: Can I change the deployment region after creating the service?
A: No. The deployment region is fixed at creation time and cannot be modified afterward. To use a different region, deploy a new service in that region.

Q: What permissions does a RAM user need to deploy a model service?
A: The RAM user must have the **Model Service-Service Deployment** permission. Without it, the **Deploy Service** button may be disabled, or the operation may fail with a 403 error.

Q: Why is only Germany (Frankfurt) available as a deployment region?
A: Germany (Frankfurt) is currently the only region in which the AI Search Open Platform supports model service deployment.

Q: How long does deployment typically take?
A: Deployment usually completes within a few minutes. The service status changes from "Deploying" to "Normal" when the service is ready. If the status changes to "Deployment Failed" instead, check the error message and deploy again.

## Pricing & Billing

### Billing Model
Billing is per request. The cost of each request is calculated from its input and output token counts, with a minimum charge of 100 tokens per request.

### Price Reference

| Tier | Input Price | Output Price |
|------|-------------|--------------|
| text_vectorization | 0.002 /tokens | 0.002 /tokens |
| reranking | 0.005 /tokens | 0.005 /tokens |
| multimodal_vector | 0.01 /tokens | 0.01 /tokens |
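The billing rule and the prices above can be combined into a quick cost estimate. The sketch below rests on two assumptions that this guide does not confirm: that the listed prices are per single token, and that the 100-token minimum applies to the combined input and output count, with any shortfall billed at the input price (the two prices are equal in every tier listed, so this choice does not change the result). The currency is not stated in the table, so the result is a bare number.

```python
from decimal import Decimal

MIN_BILLED_TOKENS = 100  # minimum charge per request, from the billing model

# (input price, output price) per token, copied from the price table.
PRICES = {
    "text_vectorization": (Decimal("0.002"), Decimal("0.002")),
    "reranking": (Decimal("0.005"), Decimal("0.005")),
    "multimodal_vector": (Decimal("0.01"), Decimal("0.01")),
}


def estimate_request_cost(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request, applying the 100-token minimum."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_price, output_price = PRICES[tier]
    # Assumption: tokens missing from the minimum are billed at the input price.
    shortfall = max(0, MIN_BILLED_TOKENS - (input_tokens + output_tokens))
    return input_price * (input_tokens + shortfall) + output_price * output_tokens


# A small reranking request (50 tokens in total) is billed as 100 tokens.
print(estimate_request_cost("reranking", 30, 20))
```

`Decimal` is used instead of `float` so that small per-token prices do not accumulate rounding error across many requests.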

### Free Tier
No free tier is available.

### Billing Notes
- Billing is applied per request.
- Each request is charged for at least 100 tokens, even if fewer are used.
- Rate limit: 100 QPS per service. Exceeding this triggers a 429 error.
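Clients that may exceed the rate limit usually retry 429 responses with increasing delays. The following is a minimal sketch of that pattern, assuming a hypothetical `send` callable that performs one request and returns an HTTP status code and body; the attempt count and delays are illustrative values, not platform recommendations.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],  # hypothetical: performs one request, returns (status, body)
    max_attempts: int = 5,                # assumed retry budget
    base_delay_s: float = 0.5,            # assumed first delay; doubles after each 429
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry a request with exponential backoff while the service answers 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Back off before the next attempt, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(base_delay_s * (2 ** attempt))
    return status, body
```

If the last attempt still returns 429, the function hands that response back to the caller instead of raising, so the caller decides whether to queue the request or drop it. Whether a rejected request is billed is not covered in this guide.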