# pai-text

# Platform for AI (PAI) AI Workloads

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Generate Text | Synchronous | Generate text content using PAI's text generation capabilities. |
| Access PAI Service Endpoints | Synchronous | Connect to PAI text generation service endpoints. |
| Generate Video | Synchronous | Generate videos using AI-powered video generation algorithms through API calls. |
| Configure Recommendation Algorithm | Synchronous | Programmatically configure recommendation algorithms through API calls. |
| Evaluate Recall Hit Rate | Synchronous | Evaluate the recall and hit rate performance of vector retrieval systems. |
| Auto ARIMA Model Selection | Synchronous | Automatically select optimal ARIMA model parameters for time series forecasting. |
| Evaluate Traces | Synchronous | Perform trace evaluation and retrieve evaluation results. |
| Access LLM Trace API | Synchronous | Look up LLM trace API operations in the function-based API list and call them. |
| List Traces Data | Synchronous | Retrieve trace data records. |

## API Calling Modes

### Authentication
The primary authentication method is **Bearer Token**.

- Include the header: `Authorization: Bearer <your_api_key>`
- Store your credential in the environment variable: `DASHSCOPE_API_KEY`
- Some components (e.g., MaxCompute-based workflows like `x13_auto_arima` and `hitrate_gl_ext`) do not require explicit authentication but run within Alibaba Cloud projects with implicit permissions.
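
The header construction described above can be sketched as a small helper. This is a minimal illustration, not an official client: the function name `build_auth_headers` is hypothetical, and only the header name and the `DASHSCOPE_API_KEY` variable come from this document.

```python
import os


def build_auth_headers(env_var: str = "DASHSCOPE_API_KEY") -> dict:
    """Return request headers carrying the Bearer token read from the environment.

    Hypothetical helper: only the header layout and variable name come from the doc.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early rather than sending an empty token and getting a 401.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key at call time keeps it out of source files and lets the same code run under different credentials.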

### Service Endpoint
The LLM trace APIs (`PaiLLMTrace`) use a global base URL pattern:

```text
https://api.alibabacloud.com/api/PaiLLMTrace/2024-03-11/{Operation}
```

For regional service endpoints (used by text generation, recommendation, and video services), the pattern is:

```text
https://{service}.{region}.aliyuncs.com
```

Common regions include:
- `cn-hangzhou`
- `cn-shanghai`
- `cn-beijing`
- `ap-southeast-1`

Examples:
- `https://pai.cn-hangzhou.aliyuncs.com`
- `https://pairecservice.cn-shanghai.aliyuncs.com`
- `https://eflo-cnp.cn-wulanchabu.aliyuncs.com`
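
As a rough sketch, the two URL patterns above can be assembled with plain string formatting. The helper names `trace_operation_url` and `regional_endpoint` are hypothetical; the base URL and the pattern are taken from this section.

```python
# Global base URL for the trace APIs, as given in this section.
TRACE_BASE_URL = "https://api.alibabacloud.com/api/PaiLLMTrace/2024-03-11"


def trace_operation_url(operation: str) -> str:
    """Build the URL for a trace operation such as ListTracesDatas."""
    if not operation:
        raise ValueError("operation must be a non-empty string")
    return f"{TRACE_BASE_URL}/{operation}"


def regional_endpoint(service: str, region: str) -> str:
    """Build a regional endpoint following https://{service}.{region}.aliyuncs.com."""
    if not service or not region:
        raise ValueError("service and region must be non-empty strings")
    return f"https://{service}.{region}.aliyuncs.com"
```

For example, `regional_endpoint("pairecservice", "cn-shanghai")` reproduces the second example URL listed above.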

### Synchronous API Pattern
All functions in this domain use **Synchronous** calling mode:
1. Send a single HTTP request (POST, GET, or PUT) to the endpoint
2. Receive a complete JSON response immediately
3. No polling or streaming is required

For trace-related APIs (`ListTracesDatas`, `EvaluateTrace`, `ListEvalResults`), use standard REST methods with query parameters or JSON bodies.

For analytics components (`x13_auto_arima`, `hitrate_gl_ext`), submit jobs using PAI command syntax (typically via MaxCompute), which execute synchronously in the backend.

## Parameter Reference

### Evaluate Recall Hit Rate

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| item_emb_table | string | true | — | — | The item embedding table. |
| true_seq_table | string | true | — | — | The ground truth table. For u2i recalls: users and user-relevant items. For i2i recalls: items and item-relevant items. |
| user_emb_table | string | false | — | — | The user embedding table. Required only for u2i recalls. |
| total_hitrate | string | true | — | — | Output table for total hit rate values. |
| hitrate_details | string | true | — | — | Output table for per-trigger hit rate details. |
| recall_type | string | true | — | one of: u2i, i2i | Recall type: u2i or i2i. |
| emb_dim | int | true | — | — | Embedding dimension of the embedding table. |
| k | int | true | — | — | Number of items to recall (top K). |
| metric | int | false | 1 | range 0-1 | Similarity metric. 0 uses L2 distance and returns the top K items with the shortest distance. 1 uses inner products and returns the top K items with the greatest inner product values. |
| strict | bool | false | False | — | When True, computes similarity without approximation. This eliminates minor deviations in the hit rate calculation but significantly increases computation time. |
| lifecycle | int | false | 7 | range 1-365 | Retention period for output tables, in days. |
| batch_size | int | false | 1024 | min 1 | Number of samples processed per batch. Reduce this value if workers run out of memory. |
| worker_count | int | false | 1 | min 1 | Number of workers. Increase this value for large input tables or when a single worker is not fast enough. |
| worker_memory | int | false | 20000 | min 1000 | Memory allocated to each worker, in MB. |
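
The constraints in this table can be checked before a job is submitted. The sketch below validates the arguments and renders a `pai -name hitrate_gl_ext` command string in the same style as the example later in this document. The function `build_hitrate_command` is a hypothetical helper, not part of PAI, and it only builds text; it does not submit anything.

```python
from typing import Optional


def build_hitrate_command(
    item_emb_table: str,
    true_seq_table: str,
    total_hitrate: str,
    hitrate_details: str,
    recall_type: str,
    emb_dim: int,
    k: int,
    user_emb_table: Optional[str] = None,
    metric: int = 1,
    strict: bool = False,
    lifecycle: int = 7,
    batch_size: int = 1024,
    worker_count: int = 1,
    worker_memory: int = 20000,
) -> str:
    """Validate parameters against the table and return the PAI command text."""
    if recall_type not in ("u2i", "i2i"):
        raise ValueError("recall_type must be 'u2i' or 'i2i'")
    if recall_type == "u2i" and not user_emb_table:
        raise ValueError("user_emb_table is required for u2i recalls")
    if metric not in (0, 1):
        raise ValueError("metric must be 0 (L2 distance) or 1 (inner product)")
    if not 1 <= lifecycle <= 365:
        raise ValueError("lifecycle must be between 1 and 365 days")
    if batch_size < 1 or worker_count < 1:
        raise ValueError("batch_size and worker_count must be at least 1")
    if worker_memory < 1000:
        raise ValueError("worker_memory must be at least 1000 MB")

    args = [("item_emb_table", item_emb_table)]
    if user_emb_table:
        args.append(("user_emb_table", user_emb_table))
    args += [
        ("true_seq_table", true_seq_table),
        ("hitrate_details", hitrate_details),
        ("total_hitrate", total_hitrate),
        ("recall_type", recall_type),
        ("k", k),
        ("emb_dim", emb_dim),
        ("metric", metric),
        ("strict", strict),
        ("batch_size", batch_size),
        ("worker_count", worker_count),
        ("worker_memory", worker_memory),
        ("lifecycle", lifecycle),
    ]
    parts = ["pai -name hitrate_gl_ext"]
    for name, value in args:
        # String values are quoted, numbers and booleans are written bare.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        parts.append(f"-D{name}={rendered}")
    return " ".join(parts) + ";"
```

Catching a missing `user_emb_table` or an out-of-range `lifecycle` locally is cheaper than waiting for a MaxCompute job to fail.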

### Auto ARIMA Model Selection

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| inputTableName | string | true | N/A | — | Name of the input table. |
| seqColName | string | true | N/A | — | Time series column. Sorts values in the value column. |
| valueColName | string | true | N/A | — | Value column. |
| groupColNames | string | false | N/A | — | Stratification columns. Separate multiple columns with commas (,), such as col0,col1. Creates a separate time series for each group. |
| start | string | false | 1.1 | Format: year.seasonal | Start time of the time series. Format: year.seasonal, such as 1986.1. |
| frequency | integer | false | 12 | Positive integer in the range (0,12] | Time series frequency. A value of 12 indicates 12 months (one year). |
| maxOrder | integer | false | 2 | Integer in the range [0,4] | Maximum values of p and q. |
| maxSeasonalOrder | integer | false | 1 | Integer in the range [0,2] | Maximum values of seasonal p and q. |
| maxDiff | integer | false | 2 | Integer in the range [0,2] | Maximum non-seasonal differencing order d. |
| maxSeasonalDiff | integer | false | 1 | Integer in the range [0,1] | Maximum seasonal differencing order d. |
| diff | integer | false | -1 | Integer in the range [0,2], or -1 | Non-seasonal differencing order d. If both diff and maxDiff are specified, maxDiff is ignored. |
| seasonalDiff | integer | false | -1 | Integer in the range [0,1], or -1 | Seasonal differencing order d. If both seasonalDiff and maxSeasonalDiff are specified, maxSeasonalDiff is ignored. |
| predictStep | integer | false | 12 | Positive integer in the range (0,365] | Number of prediction entries. |
| confidenceLevel | double | false | 0.95 | Number in the range (0,1) | Confidence level for prediction intervals. |
| outputPredictTableName | string | true | N/A | — | Output prediction table. |
| outputDetailTableName | string | true | N/A | — | Output details table. |
| coreNum | integer | false | Determined by the system | Positive integer | Number of cores. Used with memSizePerCore. |
| memSizePerCore | integer | false | Determined by the system | Positive integer in the range [1024, 65536] | Memory size per core, in MB. |
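
The ranges in this table lend themselves to a pre-flight check. The following is a minimal sketch, assuming parameters are collected in a plain dictionary keyed by the names above; `validate_arima_params` and the constant names are hypothetical and not part of PAI.

```python
REQUIRED = (
    "inputTableName",
    "seqColName",
    "valueColName",
    "outputPredictTableName",
    "outputDetailTableName",
)

# Inclusive integer bounds taken from the Constraints column.
INT_RANGES = {
    "frequency": (1, 12),
    "maxOrder": (0, 4),
    "maxSeasonalOrder": (0, 2),
    "maxDiff": (0, 2),
    "maxSeasonalDiff": (0, 1),
    "diff": (0, 2),
    "seasonalDiff": (0, 1),
    "predictStep": (1, 365),
    "memSizePerCore": (1024, 65536),
}

# These two parameters also accept -1, which is their default.
MINUS_ONE_ALLOWED = {"diff", "seasonalDiff"}


def validate_arima_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors = []
    for name in REQUIRED:
        if not params.get(name):
            errors.append(f"{name} is required")
    for name, (low, high) in INT_RANGES.items():
        if name not in params:
            continue  # Optional parameter left at its default.
        value = params[name]
        if name in MINUS_ONE_ALLOWED and value == -1:
            continue
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not low <= value <= high:
            errors.append(f"{name} must be an integer in [{low}, {high}]")
    level = params.get("confidenceLevel")
    if level is not None and not 0 < level < 1:
        errors.append("confidenceLevel must be strictly between 0 and 1")
    return errors
```

Returning all problems at once, rather than raising on the first, makes it easier to fix a command in one pass.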

### List Traces Data

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| MinTime | string | true | (current time - 2 days) | UTC format: YYYY-MM-DD or YYYY-MM-DD HH:mm:ss | The lower limit of the search time range. |
| MaxTime | string | false | (current time + 10 minutes) | UTC format: YYYY-MM-DD or YYYY-MM-DD HH:mm:ss | The upper limit of the search time range. |
| TraceIds | array | false | — | — | The list of trace IDs. |
| SpanIds | array | false | — | — | The list of span IDs. |
| PageNumber | integer | false | 1 | range 1 to infinity | The page number. |
| PageSize | integer | false | 20 | max 100 | The number of entries per page. |
| LlmAppName | string | false | "" | alphanumeric, dot, hyphen, underscore | Exact match for resources.service.app.name. |
| OpentelemetryCompatible | boolean | false | False | one of: True, False | Whether returned JSON is OpenTelemetry-compatible. |
| HasStatusMessage | boolean | false | False | one of: True, False | Return only traces with non-empty statusMessage in any span. |
| HasEvents | boolean | false | False | one of: True, False | Return only traces with non-empty events in any span. |
| OwnerSubId | string | false | "" | alphanumeric, dot, hyphen, underscore | Value of resources.service.owner.sub_id. |
| EndUserId | string | false | "" | alphanumeric, dot, hyphen, underscore | Value of attributes.service.app.user_id. |
| TraceReduceMethod | string | false | "" | one of: REMOVE_EMBEDDING, ROOT_ONLY, blank | Content simplification method to reduce data volume. |
| SortBy | string | false | "" | one of: StartTime, Duration | Field used to sort results. |
| SortOrder | string | false | DESC | one of: ASC, DESC | Sorting order. |
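
A query for this operation can be assembled and checked locally before it is sent. The sketch below covers only the time, paging, and sorting parameters. The helper names are hypothetical, and the lower bound of 1 on `PageSize` is an assumption, since the table only states the maximum.

```python
import math
from datetime import datetime

# The two UTC formats listed in the Constraints column.
TIME_FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")


def _check_time(value: str) -> str:
    """Return the value unchanged if it parses with one of the accepted formats."""
    for fmt in TIME_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return value
        except ValueError:
            continue
    raise ValueError(f"Unsupported time format: {value!r}")


def build_list_traces_params(min_time, max_time=None, page_number=1,
                             page_size=20, sort_by="", sort_order="DESC"):
    """Build the query parameters for ListTracesDatas, omitting blank optionals."""
    if page_number < 1:
        raise ValueError("PageNumber starts at 1")
    if not 1 <= page_size <= 100:  # Lower bound of 1 is an assumption.
        raise ValueError("PageSize must be between 1 and 100")
    if sort_by not in ("", "StartTime", "Duration"):
        raise ValueError("SortBy must be StartTime or Duration")
    if sort_order not in ("ASC", "DESC"):
        raise ValueError("SortOrder must be ASC or DESC")
    params = {
        "MinTime": _check_time(min_time),
        "PageNumber": page_number,
        "PageSize": page_size,
        "SortOrder": sort_order,
    }
    if max_time:
        params["MaxTime"] = _check_time(max_time)
    if sort_by:
        params["SortBy"] = sort_by
    return params


def page_count(total_count: int, page_size: int) -> int:
    """Number of pages needed to read total_count records."""
    return math.ceil(total_count / page_size)
```

With the sample response shown later in this document (`TotalCount` of 22) and a `PageSize` of 10, `page_count(22, 10)` gives 3 pages.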

### Evaluate Traces

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| TraceId | string | true | — | — | The trace ID. |
| EvaluationConfig | EvaluationConfig | true | — | — | Configuration for evaluation logic. |
| EvaluationId | string | false | — | — | The ID of the evaluation task. If not specified, the system generates one. |
| AppName | string | false | — | — | The name of the application to which the trace belongs. |
| MinTime | string | false | — | — | The start time of the search time range, in UTC format. |
| MaxTime | string | false | — | — | The end time of the search time range, in UTC format. |
| ModelConfig | ModelConfig | false | — | — | Configuration to access the internal evaluation model. |

## Code Examples

### List Trace Data - Python - All Regions

```python
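import os

import requests

# Global endpoint for the ListTracesDatas operation.
url = "https://api.alibabacloud.com/api/PaiLLMTrace/2024-03-11/ListTracesDatas"
headers = {
    # Read the key from the environment instead of hardcoding it.
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
    "Content-Type": "application/json",
}
params = {
    "MinTime": "2024-01-31",
    "MaxTime": "2024-12-31 23:59:59",
    "PageSize": 10,
}

response = requests.get(url, headers=headers, params=params, timeout=30)
response.raise_for_status()  # Surface HTTP errors such as 401, 403, or 429.
print(response.json())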
```

### List Trace Data - Bash - All Regions

```bash
curl -X GET \
  'https://api.alibabacloud.com/api/PaiLLMTrace/2024-03-11/ListTracesDatas?MinTime=2024-01-31&MaxTime=2024-12-31+23:59:59&PageSize=10' \
  -H 'Authorization: Bearer YOUR_API_KEY'
```

### Evaluate Recall Hit Rate - Bash - All Regions

```bash
pai -name hitrate_gl_ext \
    -Ditem_emb_table='item_emb_table' \
    -Duser_emb_table='user_emb_table' \
    -Dtrue_seq_table='true_seq_table' \
    -Dhitrate_details='hitrate_details' \
    -Dtotal_hitrate='total_hitrate' \
    -Drecall_type='u2i' \
    -Dk=5 \
    -Demb_dim=10 \
    -Dmetric=1 \
    -Dstrict=False \
    -Dbatch_size=1024 \
    -Dworker_count=1 \
    -Dworker_memory=20000 \
    -Dlifecycle=7;
```

### Auto ARIMA Model Selection - SQL - All Regions

```sql
PAI -name x13_auto_arima \
    -project algo_public \
    -DinputTableName=pai_ft_x13_arima_input \
    -DseqColName=id \
    -DvalueColName=number \
    -Dstart=1949.1 \
    -Dfrequency=12 \
    -DmaxOrder=4 \
    -DmaxSeasonalOrder=2 \
    -DmaxDiff=2 \
    -DmaxSeasonalDiff=1 \
    -DpredictStep=12 \
    -DoutputPredictTableName=pai_ft_x13_arima_auto_out_predict \
    -DoutputDetailTableName=pai_ft_x13_arima_auto_out_detail
```

### List Evaluation Results - Curl - All Regions

```bash
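# Double quotes let the shell expand $DASHSCOPE_API_KEY; -G sends the data as query parameters.
curl -G 'https://api.alibabacloud.com/api/PaiLLMTrace/2024-03-11/ListEvalResults' \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --data-urlencode 'EvaluationId=0bb05ae2a2dc11ef9757faaa2a1ec0c6' \
  --data-urlencode 'PageSize=10'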
```

## Response Format

```json
{
  "RequestId": "6A87228C-969A-1381-98CF-AE07AE630FA5",
  "Code": "ExecutionFailure",
  "Message": "failed to get trace data",
  "TotalCount": 22,
  "Traces": [
    "open telemetry compatible:\n{...}",
    "open telemetry incompatible:\n[{...}]"
  ]
}
```

**Key Fields**:
- `RequestId` — Unique identifier for the API request (useful for debugging)
- `TotalCount` — Total number of trace records matching the query
- `Code` — Error code if the request failed (e.g., `ExecutionFailure`)
- `Message` — Human-readable error description
- `Traces` — Array of trace records; format depends on `OpentelemetryCompatible` flag
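
One way to consume these fields is to separate the error path from the data path. The sketch below assumes that a non-empty `Code` means the request failed, which matches the field description above but is not otherwise confirmed here. `TraceApiError` and `extract_traces` are hypothetical names.

```python
class TraceApiError(Exception):
    """Raised when the response carries an error code."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (RequestId={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def extract_traces(payload: dict) -> tuple:
    """Return (total_count, traces) or raise TraceApiError.

    Assumption: a non-empty Code field signals a failed request.
    """
    code = payload.get("Code")
    if code:
        raise TraceApiError(
            code,
            payload.get("Message", ""),
            payload.get("RequestId", ""),
        )
    return payload.get("TotalCount", 0), payload.get("Traces", [])
```

Keeping `RequestId` on the exception makes it available when a failure is reported to support.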

## Error Handling

| Error Code | Description | Recommended Action |
|-------------------|----------------------------|----------------------------------------|
| ExecutionFailure | An internal error occurred while retrieving the evaluation results. Check the request parameters and retry. | Verify parameters and retry; contact support if persistent. |
| InvalidInputParams | Indicates that the input parameters are invalid. The message provides more detail about the specific issue, such as missing dataset ID or time range. | Ensure all required fields are provided and correctly formatted. |
| 400 | Invalid request parameters. Check the input values and ensure they conform to the expected format. | Validate date formats, parameter types, and allowed values. |
| 401 | Authentication failed. Verify your API key or credentials are valid and properly configured. | Check that `DASHSCOPE_API_KEY` is set and correct. |
| 403 | Insufficient permissions. Ensure the RAM user or role has the necessary permissions (paillmtrace:ListTracesDatas) to access this operation. | Assign the required RAM policy to your user or role. |
| 429 | Too many requests. You have exceeded the rate limit. Wait before retrying or contact support to increase limits. | Implement exponential backoff or reduce request frequency. |
| 500 | Internal server error. Try again later. If the issue persists, contact Alibaba Cloud support. | Retry after a delay; escalate if unresolved. |

### Rate Limits & Retry
- **Trace APIs**: 100 QPS per account
- **Auto ARIMA**: Maximum 1,200 data records per group
- **Video Generation**: 10 requests per minute

When encountering `429` errors, implement exponential backoff with jitter. Respect the `Retry-After` header if provided (though not explicitly documented, it may be sent during throttling).
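
The retry advice above can be sketched as follows. This is an illustration under stated assumptions: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, only a numeric `Retry-After` value in seconds is honored, and treating `500` as retryable follows the error table rather than any documented guarantee.

```python
import random
import time

# Status codes this sketch retries; 500 is included per the error table.
RETRYABLE_STATUS = {429, 500}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  retry_after=None) -> float:
    """Delay in seconds before the next attempt (attempt counts from 0)."""
    if retry_after is not None:
        return float(retry_after)  # Server hint takes priority.
    ceiling = min(cap, base * (2 ** attempt))
    # "Full jitter": pick a random delay between 0 and the exponential ceiling.
    return random.uniform(0, ceiling)


def call_with_retry(send, max_attempts: int = 5, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        last_try = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUS or last_try:
            return status, body
        header = str(headers.get("Retry-After", ""))
        retry_after = float(header) if header.isdigit() else None
        sleep(backoff_delay(attempt, retry_after=retry_after))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. Jitter spreads retries from many clients so they do not hit the rate limit again at the same moment.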

## Environment Requirements

- Set your API key: `export DASHSCOPE_API_KEY=your_key_here`
- For PAI command-line components (`pai -name ...`), ensure you are running within an Alibaba Cloud MaxCompute project environment with PAI enabled.
- Python example requires `requests` library: `pip install requests`

## FAQ

Q: How do I authenticate API calls to PAI AI Workloads?
A: Use a Bearer Token in the `Authorization` header: `Authorization: Bearer $DASHSCOPE_API_KEY`. Set the `DASHSCOPE_API_KEY` environment variable with your API key from the Alibaba Cloud console.

Q: Are there free tiers available for these APIs?
A: Yes. Text generation offers 1 million free tokens/month. Trace listing provides 1,000 free requests/month. Video generation includes 500 free calls/month. Recommendation APIs offer 10,000 free calls/month.

Q: Why am I getting "Insufficient permissions" (403) errors?
A: Your RAM user or role lacks the required permissions. For trace APIs, ensure the policy includes `paillmtrace:ListTracesDatas`, `paillmtrace:EvaluateTrace`, and `paillmtrace:ListEvalResults`.

Q: Can I use these APIs outside of China?
A: Yes. Regional endpoints are available both inside and outside mainland China, for example `cn-hangzhou`, `eu-central-1`, and `ap-southeast-1`. Use the regional endpoint pattern `https://{service}.{region}.aliyuncs.com`.

Q: How are time series inputs formatted for Auto ARIMA?
A: Provide a table with a sequence column (e.g., timestamps or IDs) and a numeric value column. Use the `start` parameter in `year.seasonal` format (e.g., `1986.1`) and set `frequency` (e.g., `12` for monthly data).
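
To make the `year.seasonal` format concrete, the sketch below splits a `start` value into its two parts. It assumes the number after the dot is a 1-based period within the year that must not exceed `frequency`; this reading is an assumption, and `parse_start` is a hypothetical helper.

```python
def parse_start(start: str, frequency: int = 12) -> tuple:
    """Split a start value such as '1986.1' into (year, period).

    Assumption: the part after the dot is a 1-based period within the year.
    """
    year_text, sep, period_text = start.partition(".")
    if not sep or not year_text.isdigit() or not period_text.isdigit():
        raise ValueError(f"start must look like 'year.seasonal', got {start!r}")
    year, period = int(year_text), int(period_text)
    if not 1 <= period <= frequency:
        raise ValueError(f"period {period} is outside 1..{frequency}")
    return year, period
```

The value is parsed as text rather than as a float so that `1949.10` and `1949.1` stay distinct.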

## Pricing & Billing

### Billing Model
Most APIs use **per-request** billing. Text generation uses **per-token** billing.

### Price Reference

| Tier / Model | Input Price | Output Price | Other Fees |
|--------------|-------------|--------------|-----------|
| Text Generation (default) | 0.002 /tokens | 0.004 /tokens | — |
| Video Generation (default) | 0.01 / | 0.02 / | — |
| Recommendation (default) | 0.001 / | 0.002 / | — |
| Trace Evaluation (default) | 0.002 / | 0.002 / | — |
| List Traces Data (standard) | 0.0001 / | 0.0001 / | — |
| Auto ARIMA (standard) | based on MaxCompute resource usage | based on MaxCompute resource usage | no additional cost beyond MaxCompute compute resources |

### Free Tier
- Text Generation: 1 million tokens/month
- Video Generation: 500 calls/month
- Recommendation: 10,000 calls/month
- Trace Evaluation: 100
- List Traces Data: 1,000 requests/month

### Usage Limits
- Text Generation: 8K tokens
- Video Generation: 10 requests per minute
- Recommendation: 100 QPS per account
- Trace APIs: 100 QPS
- Auto ARIMA: Maximum 1,200 data records per group

### Billing Notes
- Text generation charges more for output than input tokens.
- Video generation fees increase proportionally for videos longer than 5 seconds.
- Auto ARIMA and recall evaluation run on MaxCompute; costs depend on compute time and storage.
- Trace API calls are billed per request regardless of payload size.