# pai-benchmark

# Platform for AI (PAI) Benchmarking

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Create Benchmark Task | Synchronous | Creates a stress testing task for evaluating the performance of a service. The API allows configuring basic settings, service parameters, test data, HTTP request details, and optional advanced configurations. |
| Query Benchmark Task Report | Synchronous | Queries the report of a stress testing task in the Platform for AI (PAI) service. This API allows users to retrieve either a raw data report or a formatted report URL based on the specified report type. |

## API Calling Patterns

### Authentication
The primary authentication method is **Bearer Token** via the `Authorization` header.

- Use the header: `Authorization: Bearer <your_api_key>`
- Store your API key in the environment variable: `DASHSCOPE_API_KEY`
- Example: `export DASHSCOPE_API_KEY=sk-xxxxxx`

While other Alibaba Cloud authentication methods exist (e.g., AccessKey), the Bearer token with `DASHSCOPE_API_KEY` is recommended for PAI Benchmarking APIs based on official examples.
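
As a minimal sketch of the header construction described above, the helper below reads the key from the environment and returns the `Authorization` header. The function name `build_auth_headers` and its `environ` parameter are hypothetical, introduced only for illustration and testing.

```python
import os


def build_auth_headers(env_var="DASHSCOPE_API_KEY", environ=None):
    """Return the Bearer Authorization header (hypothetical helper).

    `environ` is an illustrative override so the function can be
    exercised without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    api_key = env.get(env_var)
    if not api_key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early when the variable is missing avoids sending a request with an empty token.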

### Service Endpoint
The APIs use region-specific endpoints under the Alibaba Cloud API gateway:

- Base pattern: `https://api.aliyun.com/api/eas/2021-07-01/{Action}`
- For international regions: `https://api.alibabacloud.com/api/eas/2021-07-01/{Action}`

Common regions include:
- `cn-shanghai`
- `cn-hangzhou`
- `cn-beijing`

Note: The `ClusterId` parameter in query operations corresponds to the region ID (e.g., `cn-shanghai`).
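
The two URL patterns above can be assembled with a small helper. This is a sketch only: `build_endpoint` and the `international` flag are hypothetical names, and the hosts and version string are copied from the patterns listed in this section.

```python
API_VERSION = "2021-07-01"

# Hosts taken from the base patterns in this section.
HOSTS = {
    False: "api.aliyun.com",       # base pattern
    True: "api.alibabacloud.com",  # international regions
}


def build_endpoint(action, international=False):
    """Build the endpoint URL for an action such as CreateBenchmarkTask."""
    return f"https://{HOSTS[international]}/api/eas/{API_VERSION}/{action}"
```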

### Synchronous API Pattern
Both available functions use a **synchronous** calling pattern:

1. **CreateBenchmarkTask**: Send a `POST` request with a JSON body containing task configuration. The response includes the task name and status message immediately.
2. **DescribeBenchmarkTaskReport**: Send a `GET` request with query parameters (`ClusterId`, `TaskName`, `ReportType`). The full report or report URL is returned directly in the response.

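Neither call requires polling or asynchronous handling: each response is returned as soon as the request is processed. Note that a report only becomes available once the task itself has completed (see the FAQ).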
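
The two calls can be sketched with the standard library as follows. The function names are hypothetical and the requests are only constructed here, not sent. The sketch assumes the base endpoint and Bearer header described earlier in this document.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://api.aliyun.com/api/eas/2021-07-01"


def create_task_request(task_config, api_key):
    """Build the POST request for CreateBenchmarkTask (JSON body)."""
    return urllib.request.Request(
        f"{BASE}/CreateBenchmarkTask",
        data=json.dumps(task_config).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def report_request(cluster_id, task_name, report_type, api_key):
    """Build the GET request for DescribeBenchmarkTaskReport (query string)."""
    query = urllib.parse.urlencode(
        {"ClusterId": cluster_id, "TaskName": task_name, "ReportType": report_type}
    )
    return urllib.request.Request(
        f"{BASE}/DescribeBenchmarkTaskReport?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Either request object can then be passed to `urllib.request.urlopen` to perform the call.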

## Parameter Reference

### Create Benchmark Task

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| body | string | false | — | — | The request body. The body includes the parameters that are set to create a stress testing task. |

> **Note**: The actual structure is a JSON object passed as the request body. Key fields, written as dotted paths into the nested object, include:
> - `base.duration`: Test duration in seconds (e.g., 600)
> - `service.serviceName`: Name of the target service
> - `service.requestToken`: Authentication token for the service
> - `data.path`: OSS URL to test data (e.g., binary or TFRecord files)
> - `data.dataType`: Format of test data (e.g., `binary`)
> - `optional.maxRt`: Maximum allowed response time in milliseconds
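
To show how the dotted paths map onto the nested JSON object, here is a small sketch that assembles the body. The helper name `build_task_body` and its default values are assumptions made for illustration; only the field names come from the note above.

```python
def build_task_body(service_name, request_token, data_path,
                    data_type="binary", duration=600, max_rt=None):
    """Assemble the CreateBenchmarkTask body from the fields listed above.

    Defaults are illustrative, not documented API defaults.
    """
    body = {
        "base": {"duration": duration},  # seconds
        "service": {"serviceName": service_name, "requestToken": request_token},
        "data": {"path": data_path, "dataType": data_type},
    }
    if max_rt is not None:
        # Include the optional section only when a limit is given (milliseconds).
        body["optional"] = {"maxRt": max_rt}
    return body
```

Serialize the result with `json.dumps` before sending it as the request body.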

### Query Benchmark Task Report

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| ClusterId | string | true | — | — | The ID of the region where the stress testing task is performed. |
| TaskName | string | true | — | — | The name of the stress testing task. |
| ReportType | string | false | — | one of: RAW, Report | The report type of the stress testing task. |
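
A client can check these constraints before sending the request. The sketch below is one way to do so; `build_report_params` is a hypothetical helper, and the allowed values are taken from the table above.

```python
ALLOWED_REPORT_TYPES = ("RAW", "Report")


def build_report_params(cluster_id, task_name, report_type=None):
    """Validate and build query parameters for DescribeBenchmarkTaskReport."""
    if not cluster_id or not task_name:
        raise ValueError("ClusterId and TaskName are required")
    params = {"ClusterId": cluster_id, "TaskName": task_name}
    if report_type is not None:
        if report_type not in ALLOWED_REPORT_TYPES:
            raise ValueError(f"ReportType must be one of {ALLOWED_REPORT_TYPES}")
        params["ReportType"] = report_type
    return params
```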

## Code Examples

### Create Benchmark Task - JSON - All Regions

```json
{
  "base": {
    "duration": 600
  },
  "service": {
    "serviceName": "test_service",
    "requestToken": "test_token"
  },
  "data": {
    "path": "https://larec-benchmark-cd.oss-cn-chengdu.aliyuncs.com/youbei/sv_dbmtl/data/youbei.warmup.tf.bin",
    "dataType": "binary"
  },
  "optional": {
    "maxRt": 100
  }
}
```

### Query Formatted Report - Bash - All Regions

```bash
curl -X GET 'https://api.aliyun.com/api/eas/2021-07-01/DescribeBenchmarkTaskReport?ClusterId=cn-shanghai&TaskName=benchmark-larec-test-015d&ReportType=Report' \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY"
```

### Query Raw Report Data - Python - All Regions

```python
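import os

import requests

# Read the key from the environment; Python does not expand "$DASHSCOPE_API_KEY" inside a string.
api_key = os.environ["DASHSCOPE_API_KEY"]

url = "https://api.aliyun.com/api/eas/2021-07-01/DescribeBenchmarkTaskReport"
params = {
    "ClusterId": "cn-shanghai",
    "TaskName": "benchmark-larec-test-015d",
    "ReportType": "RAW"
}
headers = {
    "Authorization": f"Bearer {api_key}"
}

# Raise on HTTP errors (4xx/5xx) before decoding the JSON body.
response = requests.get(url, params=params, headers=headers, timeout=30)
response.raise_for_status()
print(response.json())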
```

### Create Task via HTTP POST - Bash - China Region

```bash
curl -X POST 'https://api.aliyun.com/api/eas/2021-07-01/CreateBenchmarkTask' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
  "base": {"duration": 300},
  "service": {"serviceName": "my-model-service", "requestToken": "abc123"},
  "data": {"path": "https://my-bucket.oss-cn-shanghai.aliyuncs.com/test-data.bin", "dataType": "binary"}
}'
```

## Response Format

```json
{
  "RequestId": "40325405-579C-4D82********",
  "TaskName": "benchmark-larec-test-1076",
  "Region": "cn-shanghai",
  "Message": "Benchmark task [foo] is Creating"
}
```

**Key Fields**:
- `RequestId` — Unique identifier for the API request; useful for troubleshooting
- `TaskName` — Name of the created benchmark task; used in subsequent report queries
- `Region` — Region where the task is running
- `Message` — Human-readable status message about task creation

For report queries, the response includes:
- `ReportUrl` — Direct URL to an HTML-formatted performance report (when `ReportType=Report`)
- `Data` — JSON-encoded raw metrics including QPS, RT percentiles (TP50, TP99, etc.), status code counts, and traffic statistics (when `ReportType=RAW`)
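
The fields above could be read from decoded responses roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and because the exact shape of `Data` is not shown in this document, the code accepts either a JSON string or an already-decoded object.

```python
import json


def extract_task_name(create_response):
    """Return the TaskName from a CreateBenchmarkTask response dict."""
    return create_response["TaskName"]


def extract_report(report_response):
    """Return ("url", ReportUrl) or ("raw", decoded Data) from a report response."""
    report_url = report_response.get("ReportUrl")
    if report_url:
        return ("url", report_url)
    data = report_response.get("Data")
    if isinstance(data, str):
        # Assumption: Data may arrive as a JSON-encoded string.
        data = json.loads(data)
    return ("raw", data)
```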

## Error Handling

| Error Code | Description | Recommended Action |
|------------|-------------|---------------------|
| 400 | Bad Request: The request parameters are invalid or missing required fields. | Validate all required parameters and ensure correct data types and formats. |
| 403 | Forbidden: The user does not have sufficient permissions to perform this operation. | Verify that your API key has the necessary permissions for PAI-EAS benchmarking. |
| 404 | Not Found: The specified benchmark task was not found. | Double-check the `ClusterId` (region) and `TaskName`; ensure the task exists and has completed. |
| 429 | Too Many Requests: The request rate exceeds the allowed limit. | Implement exponential backoff; stay within the 100 QPS account limit. |
| 500 | Internal Server Error: An unexpected error occurred on the server side. | Retry with exponential backoff; contact support if the error persists. |
| 503 | Service Unavailable: The service is temporarily unavailable. | Wait a few seconds and retry; check service status. |

### Rate Limits & Retry
- **Rate limit**: 100 QPS per account
- **Retry strategy**: Use exponential backoff (e.g., 1s, 2s, 4s, 8s delays)
- If a `429` error occurs, respect any `Retry-After` header if present, or wait at least 1 second before retrying
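
The retry guidance above might be implemented along these lines. This is a sketch, not a definitive client: `next_delay` and `call_with_retry` are hypothetical names, `send` is an assumed callable returning `(status_code, headers, body)`, and treating 429, 500, and 503 as retryable follows the error table in this document.

```python
import time

RETRYABLE_STATUS = (429, 500, 503)


def next_delay(attempt, retry_after=None, base=1.0):
    """Delay before the next retry: Retry-After if numeric, else base * 2**attempt."""
    if retry_after is not None:
        try:
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # Retry-After may be an HTTP date; fall back to backoff.
    return base * 2 ** attempt


def call_with_retry(send, max_retries=4, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out."""
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_retries:
            return status, body
        sleep(next_delay(attempt, headers.get("Retry-After")))
```

With the defaults, the delays without a `Retry-After` header are 1, 2, 4, and 8 seconds, matching the schedule listed above.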

## Environment Requirements

- Set your API key: `export DASHSCOPE_API_KEY=your_api_key_here`
- Required packages (for Python examples): `requests` (`pip install requests`)
- No specific runtime version constraints are documented

## FAQ

Q: How do I find the name of a benchmark task I just created?
A: The `CreateBenchmarkTask` API returns a `TaskName` field in the response. Save this value to query the report later.

Q: What regions support benchmarking?
A: Benchmarking is available in major Alibaba Cloud regions including `cn-shanghai`, `cn-hangzhou`, and `cn-beijing`. Use the region ID as the `ClusterId` when querying reports.

Q: Can I use my own test data?
A: Yes. Provide an OSS URL in the `data.path` field and specify the format (e.g., `binary`, `text`) in `data.dataType`.

Q: Why am I getting a 404 when querying a report?
A: Ensure the task has completed successfully and that you’re using the correct `ClusterId` (region) and exact `TaskName`. Reports are only available after task completion.

Q: Is there a maximum test duration?
A: The documentation does not specify a hard limit, but very long durations may be subject to platform quotas or cost controls. Start with durations like 300–600 seconds for initial tests.

## Pricing & Billing

*Pricing is not covered in this document. Consult the official Alibaba Cloud PAI pricing page for up-to-date billing details.*