# pai-model

Part of **PAI**

<!-- intent-backlink:auto -->

> 💡 **Path Selection**: This skill is one implementation path for [Deploy a model for online inference](../../intent/pai-deploy-inference/SKILL.md). If you're unsure which path to take, check the routing skill first.

# Platform for AI (PAI) Model Management

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Manage Model | Synchronous | Create, delete, update, get details, or list machine learning models. |
| Manage Model Version | Synchronous | Create, delete, update, or get details of model versions. |
| Manage Model Label | Synchronous | Create or delete labels for models and model versions. |
| Manage Model Versions | Synchronous | Create and manage different versions of machine learning models. |
| Get Model Details | Synchronous | Retrieve detailed information about a specific model. |
| Manage Model Templates | Synchronous | Create and configure reusable model templates. |
| Configure Models | Synchronous | Set up model configurations for deployment or training. |
| Deploy Model | Synchronous | Deploy a machine learning model. |
| Manage Inference Jobs | Synchronous | Create inference jobs. |
| Manage Online Evaluation Tasks | Synchronous | Create, delete, get, list, stop, and update online evaluation tasks for models. |
| Configure Evaluation | Synchronous | Set up evaluation templates, data extraction, and model configurations for evaluation. |
| Get TensorBoard Shared URL | Synchronous | Retrieve a shared access URL for a TensorBoard instance. |
| Start TensorBoard Instance | Synchronous | Start a TensorBoard instance for model development purposes. |

## API Calling Mode

### Authentication
Use Bearer Token authentication with your DashScope API key.

- Header format: `Authorization: Bearer <your_api_key>`
- Environment variable: `DASHSCOPE_API_KEY`
- Set this header on all requests to PAI Model Management APIs
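
The header setup above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper name `auth_headers` is hypothetical, and the only facts taken from this document are the `Authorization: Bearer` header format and the `DASHSCOPE_API_KEY` variable name.

```python
import os


def auth_headers(env_var: str = "DASHSCOPE_API_KEY") -> dict:
    """Build request headers from the API key stored in an environment variable.

    `auth_headers` is a hypothetical helper name used for illustration only.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early instead of sending a request with an empty token
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing before the request is sent keeps a missing key from surfacing later as a 401 response.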

### Service Endpoint
Endpoints follow this pattern, with a different host for China and international regions:

`https://api.aliyun.com/api/{service}/{version}` (China regions)  
`https://api.alibabacloud.com/api/{service}/{version}` (International regions)

Common regions include:
- `cn-hangzhou`
- `cn-shanghai`
- `cn-beijing`

Service names vary by function:
- `AIWorkSpace` for model management
- `PaiLLMTrace` for evaluation tasks
- `pai-dlc` for TensorBoard operations
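
As a rough sketch, the endpoint pattern can be assembled with a small helper. The function name `build_endpoint` and the `international` flag are hypothetical. Appending the action name after the version is an assumption based on the action-style URLs in the code examples in this document; some examples use resource-style paths instead.

```python
CHINA_HOST = "https://api.aliyun.com"
INTERNATIONAL_HOST = "https://api.alibabacloud.com"


def build_endpoint(service: str, version: str, action: str,
                   international: bool = False) -> str:
    """Join host, service, version, and action into one URL.

    Hypothetical helper: the trailing `/{action}` segment is assumed from
    the examples in this document, not from an official specification.
    """
    host = INTERNATIONAL_HOST if international else CHINA_HOST
    return f"{host}/api/{service}/{version}/{action}"


# Example: the CreateModel endpoint used in the curl example
create_model_url = build_endpoint("AIWorkSpace", "2021-02-04", "CreateModel")
```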

### Synchronous API Pattern
All PAI Model Management APIs follow a synchronous request-response pattern:

1. Send an HTTP request (GET, POST, PUT, or DELETE) to the appropriate endpoint
2. Include required parameters in the URL query string, path, or request body
3. Add the `Authorization: Bearer $DASHSCOPE_API_KEY` header
4. Receive an immediate JSON response with results or status

No polling or async handling is required; each response is returned directly.
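
The four steps above can be sketched with the Python standard library. This is an illustrative outline rather than an official client: `prepare_request` and `send` are hypothetical names, and the split into "prepare" and "send" is only a design choice that lets the request be inspected before any network call.

```python
import json
import urllib.parse
import urllib.request


def prepare_request(method, url, api_key, query=None, body=None):
    """Steps 1-3: build the HTTP request with parameters and the auth header."""
    if query:
        # Parameters passed in the URL query string
        url = f"{url}?{urllib.parse.urlencode(query)}"
    # Parameters passed in the JSON request body
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def send(req, timeout=30):
    """Step 4: send the request and decode the JSON response."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call would then look like `send(prepare_request("POST", url, api_key, body={...}))`, with no polling loop around it.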

## Parameter Reference

### Manage Model

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| ModelName | string | true | | 1 to 127 characters in length | The name of the model. |
| ModelDescription | string | false | | | The description of the model. |
| WorkspaceId | string | false | | | The ID of the workspace. |
| Accessibility | string | false | PRIVATE | one of: PRIVATE, PUBLIC | The visibility of the model in the workspace. |
| Origin | string | false | | | The source of the model (e.g., ModelScope, HuggingFace). |
| Domain | string | false | | | The domain (e.g., nlp, cv). |
| Task | string | false | | | The task (e.g., text-classification). |
| ModelDoc | string | false | | | The model documentation. |
| OrderNumber | integer | false | | | The ordinal number for custom sorting. |
| ModelType | string | false | | | The model type (e.g., Checkpoint, LoRA). |
| ExtraInfo | object | false | | | Other information about the model. |
| ParameterSize | integer | false | | | The number of parameters, in millions. |
| Tag | array | false | | | A list of tags. |
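
To catch a 400 response before it happens, the two documented constraints in this table can be checked on the client side. The sketch below is an assumption-laden illustration: `validate_create_model` is a hypothetical helper, and it checks only the `ModelName` length and the `Accessibility` values listed above.

```python
ACCESSIBILITY_VALUES = {"PRIVATE", "PUBLIC"}


def validate_create_model(payload: dict) -> list:
    """Return a list of problems found in a Manage Model payload (empty if none)."""
    errors = []
    name = payload.get("ModelName")
    # ModelName is required and must be 1 to 127 characters long
    if not isinstance(name, str) or not 1 <= len(name) <= 127:
        errors.append("ModelName is required and must be 1 to 127 characters")
    # Accessibility defaults to PRIVATE when omitted
    accessibility = payload.get("Accessibility", "PRIVATE")
    if accessibility not in ACCESSIBILITY_VALUES:
        errors.append("Accessibility must be PRIVATE or PUBLIC")
    return errors
```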

### Manage Model Version

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| ModelId | string | true | | | The model ID. |
| VersionName | string | false | | | The model version name. |
| Uri | string | true | | | The URI of the model version (OSS or HTTP). |
| VersionDescription | string | false | | | The description of the model version. |
| FormatType | string | false | | one of: OfflineModel, SavedModel, Keras H5, Frozen Pb, Caffe Prototxt, TorchScript, XGBoost, PMML, AlinkModel, ONNX | The format of the model. |
| FrameworkType | string | false | | one of: Pytorch, XGBoost, Keras, Caffe, Alink, Xflow, TensorFlow | The framework of the model. |
| Options | string | false | {} | | Extended fields as a JSON string. |
| Metrics | object | false | | max length 8192 bytes | The model metrics. |
| TrainingSpec | object | false | {} | | Training configurations for fine-tuning. |
| InferenceSpec | object | false | {} | | Downstream inference configurations. |
| SourceType | string | false | Custom | one of: Custom, PAIFlow, TrainingService | The source type of the model. |
| SourceId | string | false | | | The source ID with specific format if not Custom. |
| ApprovalStatus | string | false | | one of: Pending, Approved, Rejected | The approval status. |
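
The enumerated values and the metrics size limit in this table lend themselves to a quick local check. The following is a sketch under stated assumptions: `validate_model_version` is a hypothetical helper, and measuring `Metrics` as UTF-8 bytes of its JSON serialization is a guess at how the 8192-byte limit is applied.

```python
import json

FORMAT_TYPES = {"OfflineModel", "SavedModel", "Keras H5", "Frozen Pb",
                "Caffe Prototxt", "TorchScript", "XGBoost", "PMML",
                "AlinkModel", "ONNX"}
FRAMEWORK_TYPES = {"Pytorch", "XGBoost", "Keras", "Caffe", "Alink",
                   "Xflow", "TensorFlow"}
SOURCE_TYPES = {"Custom", "PAIFlow", "TrainingService"}
APPROVAL_STATUSES = {"Pending", "Approved", "Rejected"}
MAX_METRICS_BYTES = 8192


def validate_model_version(payload: dict) -> list:
    """Return a list of problems found in a model version payload."""
    errors = []
    for required in ("ModelId", "Uri"):
        if not payload.get(required):
            errors.append(f"{required} is required")
    enum_checks = {
        "FormatType": FORMAT_TYPES,
        "FrameworkType": FRAMEWORK_TYPES,
        "SourceType": SOURCE_TYPES,
        "ApprovalStatus": APPROVAL_STATUSES,
    }
    for field, allowed in enum_checks.items():
        value = payload.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field} must be one of {sorted(allowed)}")
    metrics = payload.get("Metrics")
    if metrics is not None:
        # Assumption: the limit applies to the serialized JSON in UTF-8 bytes
        size = len(json.dumps(metrics).encode("utf-8"))
        if size > MAX_METRICS_BYTES:
            errors.append(f"Metrics is {size} bytes, limit is {MAX_METRICS_BYTES}")
    return errors
```

Note that the values are case-sensitive as listed, so `Pytorch` passes this check while `PyTorch` does not.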

### Manage Online Evaluation Tasks

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| TaskName | string | false | | | The task name. |
| AppName | string | false | | | The name of the user application. |
| StartTime | string | false | | | The start time of trace data (UTC). |
| EndTime | string | false | | | The end time of trace data (UTC). |
| SamplingFrequencyMinutes | integer | false | | | Time window width for input data search. |
| Description | string | false | | | The description of the task. |
| Filters | array<object> | false | | | Search filter conditions for trace data. |
| SamplingRatio | integer | false | | | Percentage of data used as evaluation input. |
| EvaluationConfig | EvaluationConfig | false | | | JSON paths to extract values from trace data. |
| ModelConfig | ModelConfig | false | | | Access configuration for the evaluation model. |

### TensorBoard Operations

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| TensorboardId | string | true | | | The ID of the TensorBoard task. |
| ExpireTimeSeconds | string | false | | max 604800 | Validity period of the shareable link (seconds). |
| WorkspaceId | string | false | | | The workspace ID. |
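
The query string for a TensorBoard request can be built while enforcing the documented 604800-second ceiling. In this sketch `shared_url_query` is a hypothetical helper; rejecting zero or negative values is an added assumption, since the table only states a maximum.

```python
from urllib.parse import urlencode

MAX_EXPIRE_SECONDS = 604800  # 7 days, the documented maximum


def shared_url_query(tensorboard_id, expire_seconds=None, workspace_id=None) -> str:
    """Build the query string for a TensorBoard shared-URL request."""
    if not tensorboard_id:
        raise ValueError("TensorboardId is required")
    params = {"TensorboardId": tensorboard_id}
    if expire_seconds is not None:
        # Assumption: values must be positive; only the maximum is documented
        if not 0 < expire_seconds <= MAX_EXPIRE_SECONDS:
            raise ValueError(f"ExpireTimeSeconds must be 1 to {MAX_EXPIRE_SECONDS}")
        # The table lists this parameter as a string
        params["ExpireTimeSeconds"] = str(expire_seconds)
    if workspace_id:
        params["WorkspaceId"] = workspace_id
    return urlencode(params)
```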

## Code Examples

### Create Model - curl - all

```bash
curl -X POST https://api.aliyun.com/api/AIWorkSpace/2021-02-04/CreateModel \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "ModelName": "Sentiment analysis",
  "ModelDescription": "General sentiment analysis.",
  "WorkspaceId": "234**",
  "Accessibility": "PUBLIC",
  "Origin": "ModelScope",
  "Domain": "nlp",
  "Task": "text-classification"
}'
```

### List Model Versions - python - all

```python
import os

import requests

url = "https://api.alibabacloud.com/api/AIWorkSpace/2021-02-04/ListModelVersions"
headers = {
    # Read the API key from the environment instead of hard-coding it
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
    "Content-Type": "application/json"
}
params = {
    "ModelId": "model-dajbueh******",
    "PageNumber": 1,
    "PageSize": 10,
    "Order": "DESC",
    "SortBy": "GmtCreateTime",
    "VersionName": "1.0.1",
    "FormatType": "SavedModel",
    "FrameworkType": "TensorFlow",
    "SourceType": "PAIFlow",
    "ApprovalStatus": "Approved"
}

response = requests.get(url, headers=headers, params=params)
print(response.json())
```

### Delete Model Labels - python - all

```python
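import os

import requests

url = "https://api.aliyun.com/api/AIWorkSpace/2021-02-04/models/model-d8dfd****sjfd/labels"
headers = {
    # Python does not expand "$DASHSCOPE_API_KEY" inside a string, so read it explicitly
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}"
}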
params = {
    "LabelKeys": "key1,key2"
}

response = requests.delete(url, headers=headers, params=params)
print(response.json())
```

### Get TensorBoard Shared URL - bash - all

```bash
# Double quotes around the header let the shell expand $DASHSCOPE_API_KEY
curl -X GET 'https://api.aliyun.com/api/pai-dlc/2020-12-03/GetTensorboardSharedUrl?TensorboardId=tbxxxxxx&ExpireTimeSeconds=86400' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
```

### Create Online Evaluation Task - curl - all

```bash
curl -X POST https://api.alibabacloud.com/api/v1/PAILLMTrace/onlineevaltasks \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "TaskName": "my-llm-app-eval-task-1",
  "AppName": "my-best-llm-app",
  "StartTime": "2025-04-05 14:00:01",
  "EndTime": "2025-06-05 14:00:01",
  "SamplingFrequencyMinutes": 9,
  "Description": "April to June data assessment",
  "SamplingRatio": 50,
  "Filters": [
    {
      "Key": "ServiceId",
      "Operator": "=",
      "Value": "foo"
    }
  ]
}'
```

### Update Model Version - curl - all

```bash
curl -X PUT https://api.aliyun.com/api/AIWorkSpace/2021-02-04/models/model-dfs1****5c/versions/0.1.0 \
  -H "Authorization: Bearer <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "VersionDescription": "General sentiment analysis.",
    "Metrics": {
      "Results": [
        {
          "Dataset": {
            "DatasetId": "d-sdkjanksaklerhfd"
          },
          "Metrics": {
            "cer": 0.175
          }
        },
        {
          "Dataset": {
            "Uri": "oss://xxxx/"
          },
          "Metrics": {
            "cer": 0.172
          }
        }
      ]
    },
    "InferenceSpec": {
      "processor": "tensorflow_gpu_1.12"
    },
    "SourceType": "PAIFlow",
    "SourceId": "region=cn-shanghai,workspaceId=13**,kind=PipelineRun,id=run-sakdb****jdf",
    "ApprovalStatus": "Approved"
  }'
```

## Response Format

```json
{
  "RequestId": "9DAD3112-AE22-5563-9A02-5C7E8****E35",
  "ModelId": "model-rbvg5wzljz****ks92"
}
```

**Key Fields** (which fields appear depends on the operation; the example above shows a model creation response):
- `RequestId`: Unique identifier for the API request
- `ModelId`: Unique identifier for the created model
- `VersionName`: Name of the created model version
- `TensorboardSharedUrl`: Shareable URL for TensorBoard access
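
Because different operations return different subsets of these fields, a tolerant parser is one reasonable way to read a response. The sketch below assumes only the field names listed above; `extract_fields` is a hypothetical helper.

```python
import json

KEY_FIELDS = ("RequestId", "ModelId", "VersionName", "TensorboardSharedUrl")


def extract_fields(raw: str, fields=KEY_FIELDS) -> dict:
    """Return only the known key fields that are present in a JSON response."""
    data = json.loads(raw)
    return {name: data[name] for name in fields if name in data}


sample = '{"RequestId": "9DAD3112-AE22-5563-9A02-5C7E8****E35", "ModelId": "model-rbvg5wzljz****ks92"}'
parsed = extract_fields(sample)
```

With the sample response from this section, `parsed` holds `RequestId` and `ModelId` and simply omits the fields that were not returned.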

## Error Handling

| Error Code | Description | Recommended Action |
|-------------------|---------------------------|----------------------------------------|
| 400 | Bad Request: The request parameters are invalid or missing. | Verify all required parameters are present and correctly formatted. |
| 401 | Unauthorized: The API key or credentials are invalid or not authorized. | Check that your DASHSCOPE_API_KEY is valid and properly set. |
| 403 | Forbidden: The user does not have sufficient permissions to perform this operation. | Ensure your account has the required RAM permissions for the operation. |
| 404 | Not Found: The specified workspace, model, or resource does not exist. | Verify the IDs you're using exist and are accessible to your account. |
| 429 | Too Many Requests: The request rate exceeds the allowed limit. | Implement rate limiting in your client or wait before retrying. |
| 500 | Internal Server Error: An unexpected error occurred on the server side. | Retry the request after a short delay; contact support if persistent. |
| 503 | Service Unavailable: The service is temporarily unavailable. | Wait and retry later; the service may be undergoing maintenance. |
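
The recommended actions for 429, 500, and 503 amount to waiting and retrying, which can be sketched as exponential backoff. This is an illustration under assumptions: `call_with_retry` is a hypothetical helper, `send` stands for any callable that returns a `(status_code, body)` pair, and the attempt count and delays are arbitrary defaults rather than documented values.

```python
import time

RETRYABLE_STATUS = {429, 500, 503}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` is a caller-supplied function returning (status_code, body).
    Delays double after each retryable failure: 1s, 2s, 4s by default.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            return status, body
        sleep(base_delay * 2 ** (attempt - 1))
```

Client errors such as 400, 401, 403, and 404 are returned immediately, since retrying them without changing the request would not help.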

## Environment Requirements

- Set your API key as an environment variable: `export DASHSCOPE_API_KEY=your_key_here`
- Use a standard HTTP client (curl, requests, etc.); no special SDK is required
- Ensure your system clock is synchronized for proper request signing

## FAQ

Q: How do I authenticate my requests to the PAI Model Management API?
A: Include the header `Authorization: Bearer $DASHSCOPE_API_KEY` in all requests, where `DASHSCOPE_API_KEY` is your DashScope API key stored as an environment variable.

Q: What model formats and frameworks are supported?
A: Supported formats include SavedModel, ONNX, TorchScript, PMML, and more. Supported frameworks include TensorFlow, PyTorch, Keras, XGBoost, Caffe, and Alink.

Q: How can I organize my models effectively?
A: Use labels and tags to categorize models, set appropriate domains (nlp, cv) and tasks (text-classification), and leverage workspaces for team collaboration.

Q: Are there any free operations available?
A: Yes, some operations like DeleteModel are free, while others have monthly free tiers (typically 100-1000 requests per month).

Q: How do I handle model versioning?
A: Each model can have multiple versions with unique names following semantic versioning (e.g., 0.1.0). Use CreateModelVersion to add new versions and ListModelVersions to view them.

## Pricing & Billing

### Billing Model
Billing is per request: each API call counts as one request regardless of success or failure.

### Price Reference

| Operation | Input Price | Output Price |
|-----------|-------------|--------------|
| CreateModel | 0.001 / | |
| GetModel | 0.0001 / | 0.0001 / |
| ListModels | 0.0001 / | 0.0001 / |
| UpdateModel | 0.001 / | 0.001 / |
| CreateModelVersion | 0.001 / | |
| GetModelVersion | 0.001 / | |
| ListModelVersions | 0.0001 / | 0.0001 / |
| DeleteModelVersion | 0.001 / | |
| CreateModelLabels | 0.001 / | 0.001 / |
| DeleteModelLabels | 0.001 / | |
| GetOnlineEvalTask | 0.001 / | 0.002 / |
| ListOnlineEvalTasks | 0.0001 / | 0.0002 / |
| GetTensorboardSharedUrl | 0.001 / | 0.001 / |
| StartTensorboard | 0.001 / | |

### Free Tier
- DeleteModel: Free with no additional charges
- Most other operations: 100-1000 free requests per month
- GetModel and ListModels: 1000 free calls per month

### Usage Limits
- QPS limits range from 10-100 requests per second depending on the API
- Single request size limits: 8192 characters for metrics and certain fields
- TensorBoard shared URLs: Maximum validity of 604800 seconds (7 days)
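
To stay under a QPS limit, a client can space out its calls. The sketch below is one simple approach, not a documented requirement: the `Throttle` class is hypothetical, and the actual limit for each API (somewhere in the 10 to 100 range stated above) has to be supplied by the caller.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls to respect a QPS limit."""

    def __init__(self, max_qps, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_qps
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before each request with `Throttle(max_qps=10)` keeps requests at least 0.1 seconds apart.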

### Billing Notes
- Failed requests are still counted toward your quota and billed
- Pricing is consistent across China and international regions
- No minimum charges: billing is strictly per request