# pai-knowledge

Part of **PAI**

# Platform for AI (PAI) Knowledge Management

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Manage Knowledge Bases | Synchronous | Create, delete, get, list, or update knowledge bases. |
| Manage Knowledge Base Jobs | Synchronous | Create, delete, get, list, or update knowledge base jobs. |
| Manage Knowledge Base Chunks | Synchronous | List or update knowledge base chunks. |
| Retrieve Knowledge | Synchronous | Retrieve content from a knowledge base using supported models. |
| Upload File Chunk | Synchronous | Upload a chunk of a file to a knowledge base. |
| Configure Knowledge Base Job | Synchronous | Configure a knowledge base job with VPC settings. |

## API Calling Patterns

### Authentication
The primary authentication method is **Bearer Token**.

- Use the header: `Authorization: Bearer <your_api_key>`
- Store your API key in the environment variable: `DASHSCOPE_API_KEY`
- Example: `export DASHSCOPE_API_KEY=sk-xxxxxx`
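
As a minimal sketch of the steps above, the header can be built from the environment variable at call time. The helper name `build_auth_headers` is hypothetical and not part of any SDK.

```python
import os


def build_auth_headers(env_var: str = "DASHSCOPE_API_KEY") -> dict:
    """Return request headers carrying the Bearer token read from the environment."""
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early instead of sending a request that would be rejected with 401.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key at call time keeps it out of source files and shell history.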

### Service Endpoint
All APIs use a single global endpoint pattern:

- Base URL: `https://api.alibabacloud.com/api/PAILangStudio/2024-07-10/{Operation}`
- The service is region-agnostic; no regional endpoints are required.
- Common regions referenced in data sources include `cn-hangzhou`, `cn-shanghai`, and `cn-beijing` (for OSS URIs).

### Synchronous API Pattern
All operations in this domain follow a **Synchronous** calling pattern:
1. Send an HTTP request (GET, POST, PUT, or DELETE) to the operation-specific endpoint.
2. Include the `Authorization: Bearer $DASHSCOPE_API_KEY` header.
3. For POST/PUT requests, provide a JSON body with required parameters.
4. Receive an immediate JSON response with results or error details.
5. No polling or async status checks are needed—responses are returned directly.
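
The steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not an official client: the helper names `build_request` and `call` are hypothetical, and the base URL is the one given in the Service Endpoint section.

```python
import json
import os
import urllib.request
from typing import Optional

BASE_URL = "https://api.alibabacloud.com/api/PAILangStudio/2024-07-10"


def build_request(operation: str, method: str = "POST",
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one operation, e.g. 'CreateKnowledgeBase'."""
    headers = {"Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}"}
    data = None
    if body is not None:
        # POST/PUT requests carry their parameters as a JSON body.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/{operation}", data=data, headers=headers, method=method
    )


def call(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the immediate JSON response; no polling is involved."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Note that `urlopen` raises `urllib.error.HTTPError` for 4xx/5xx responses, so error details have to be read from the exception rather than from the return value.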

## Parameter Reference

### Manage Knowledge Bases

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| WorkspaceId | string | true | - | - | The ID of the workspace. |
| Name | string | true | - | max length 127 chars, starts with letter, contains only letters, numbers, or underscores | The name of the knowledge base. |
| KnowledgeBaseType | string | true | - | one of: TEXT, STRUCTURED, IMAGE, VIDEO | The type of the knowledge base. |
| OutputDir | string | true | - | - | Storage path for output data. |
| DataSources | array<object> | true | - | - | Data source. |
| ChunkConfig | object | true | - | - | File slicing configuration. |
| EmbeddingConfig | object | true | - | - | Vector index configuration. |
| VectorDBConfig | object | true | - | - | Vector store configuration. |
| Accessibility | string | false | - | one of: PRIVATE, PUBLIC | The visibility of the knowledge base. |
| Description | string | false | - | - | Custom description of the knowledge base. |
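
The constraints in the table can be checked on the client before sending a request, which avoids a round trip that would end in a 400. The sketch below is a hypothetical helper (`validate_create_payload` is not an SDK function) and assumes that "letters" in the `Name` constraint means ASCII letters.

```python
import re

# Assumption: "letters" means ASCII letters. One leading letter plus up to 126
# further letters, digits, or underscores gives the documented maximum of 127.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,126}")
KNOWLEDGE_BASE_TYPES = {"TEXT", "STRUCTURED", "IMAGE", "VIDEO"}
ACCESSIBILITY_VALUES = {"PRIVATE", "PUBLIC"}
REQUIRED_FIELDS = (
    "WorkspaceId", "Name", "KnowledgeBaseType", "OutputDir",
    "DataSources", "ChunkConfig", "EmbeddingConfig", "VectorDBConfig",
)


def validate_create_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means every documented check passed."""
    problems = [f"missing required field: {f}" for f in REQUIRED_FIELDS if f not in payload]

    name = payload.get("Name")
    if name is not None and not (isinstance(name, str) and NAME_PATTERN.fullmatch(name)):
        problems.append("Name must start with a letter, use only letters, digits, "
                        "or underscores, and be at most 127 characters long")

    kb_type = payload.get("KnowledgeBaseType")
    if kb_type is not None and kb_type not in KNOWLEDGE_BASE_TYPES:
        problems.append(f"KnowledgeBaseType must be one of {sorted(KNOWLEDGE_BASE_TYPES)}")

    accessibility = payload.get("Accessibility")
    if accessibility is not None and accessibility not in ACCESSIBILITY_VALUES:
        problems.append(f"Accessibility must be one of {sorted(ACCESSIBILITY_VALUES)}")

    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.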

### Retrieve Knowledge

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| WorkspaceId | string | true | - | - | Workspace ID where the knowledge base is located. |
| Query | string | true | - | - | Retrieval content. |
| KnowledgeBaseId | string | false | - | - | Knowledge base ID. |
| TopK | integer | false | - | - | Number of top-ranked results to return. |
| ScoreThreshold | float | false | - | range [0, 1] | Similarity score threshold. |
| QueryMode | string | false | - | one of: dense, hybrid | Retrieval mode. |
| VersionName | string | false | v1 | - | Knowledge base version. |
| MetaDataFilterConditions | string | false | - | - | Metadata filter conditions (JSON string). |
| RerankConfig | string | false | - | - | Rerank configuration (JSON string). |
| RewriteConfig | string | false | - | - | Query rewrite configuration (JSON string). |
| HybridStrategyConfig | string | false | - | - | Hybrid retrieval strategy config (JSON string). |

### Manage Knowledge Base Jobs

| Parameter | Type | Required | Default | Constraints | Description |
|----------|------|----------|---------|-------------|-------------|
| WorkspaceId | string | true | - | - | The ID of the workspace. |
| JobAction | string | false | - | one of: SyncIndex | The type of the task operation. |
| MaxRunningTimeInSeconds | integer | false | - | range 1-86400 | Maximum running time for the task, in seconds. |
| EcsSpecs | array<object> | false | - | - | Task run resource configuration list. |
| UserVpc | object | false | - | - | Task run VPC info. |
| EmbeddingConfig | object | false | - | - | Index configuration. |
| KnowledgeBaseId | string | false | - | - | The ID of the Knowledge Base. |
| Description | string | false | - | - | Knowledge base task description. |
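
The two constrained job parameters, `JobAction` and `MaxRunningTimeInSeconds`, can be checked before a job is submitted. The following is a sketch only; `validate_job_params` is a hypothetical name, and it enforces nothing beyond what the table states.

```python
def validate_job_params(params: dict) -> None:
    """Raise ValueError if a parameter violates a constraint from the table above."""
    if not params.get("WorkspaceId"):
        raise ValueError("WorkspaceId is required")

    action = params.get("JobAction")
    if action is not None and action != "SyncIndex":
        raise ValueError("JobAction must be 'SyncIndex'")

    max_runtime = params.get("MaxRunningTimeInSeconds")
    if max_runtime is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(max_runtime, bool) or not isinstance(max_runtime, int):
            raise ValueError("MaxRunningTimeInSeconds must be an integer")
        if not 1 <= max_runtime <= 86400:
            raise ValueError("MaxRunningTimeInSeconds must be between 1 and 86400")
```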

## Code Examples

### Create a Knowledge Base (Python)

```python
import json
import os

import requests

# Set your API endpoint and headers; the key is read from DASHSCOPE_API_KEY
url = "https://api.alibabacloud.com/api/PAILangStudio/2024-07-10/CreateKnowledgeBase"
headers = {
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
    "Content-Type": "application/json"
}

# Request body
payload = {
    "WorkspaceId": "478**",
    "Name": "myName",
    "KnowledgeBaseType": "TEXT",
    "OutputDir": "oss://test-bucket.oss-cn-hangzhou-internal.aliyuncs.com/langstudio/output/",
    "DataSources": [
        {
            "Uri": "oss://test-bucket.oss-cn-hangzhou-internal.aliyuncs.com/langstudio/source/"
        }
    ],
    "ChunkConfig": {
        "ChunkSize": 1024,
        "ChunkOverlap": 200,
        "ChunkDuration": 30,
        "ChunkStrategy": "Default"
    },
    "EmbeddingConfig": {
        "ConnectionId": "conn-r3o7******38bh",
        "Model": "text-embedding-v4"
    },
    "VectorDBConfig": {
        "VectorDBType": "Milvus",
        "ConnectionId": "conn-7y5y******jja7",
        "CollectionName": "my_collection"
    }
}

# Make the API call
response = requests.post(url, headers=headers, data=json.dumps(payload))

# Handle response
if response.status_code == 200:
    result = response.json()
    print(f"Knowledge base created successfully. ID: {result['KnowledgeBaseId']}")
else:
    print(f"Error: {response.status_code} - {response.text}")
```

### List Knowledge Bases (Python)

```python
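import os

import requests

url = "https://api.alibabacloud.com/api/PAILangStudio/2024-07-10/ListKnowledgeBases"
headers = {
    # Python does not expand "$DASHSCOPE_API_KEY" inside a string, so read the variable explicitly
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
    "Content-Type": "application/json"
}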
params = {
    "WorkspaceId": "478***",
    "Name": "myName",
    "KnowledgeBaseType": "TEXT"
}

response = requests.get(url, headers=headers, params=params)
print(response.json())
```

### Retrieve Knowledge with Reranking (Python)

```python
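import json
import os

import requests

url = "https://api.alibabacloud.com/api/PAILangStudio/2024-07-10/RetrieveKnowledgeBase"
headers = {
    # Python does not expand "$DASHSCOPE_API_KEY" inside a string, so read the variable explicitly
    "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
    "Content-Type": "application/json"
}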

payload = {
    "body": {
        "WorkspaceId": "174***",
        "Query": "red car",
        "TopK": 5,
        "ScoreThreshold": 0.5,
        "QueryMode": "dense",
        "VersionName": "v1",
        "RerankConfig": "{\"ConnectionId\":\"conn-xxx\",\"Model\":\"qwen-max\",\"TopK\":5}",
        "RewriteConfig": "{\"ConnectionId\":\"conn-xxx\",\"Model\":\"qwen-max\",\"Temperature\":0.7,\"TopP\":0.9,\"PresencePenalty\":0.5,\"FrequencyPenalty\":0.5,\"Seed\":0,\"MaxTokens\":1024,\"Stop\":[],\"EnableThinking\":true}",
        "HybridStrategyConfig": "{\"Strategy\":\"rrf\",\"RRFK\":60,\"Weight\":0.5}"
    }
}

response = requests.post(url, headers=headers, data=json.dumps(payload))
print(response.json())
```

### Create a Knowledge Base Job (Bash)

```bash
curl -X POST 'https://api.alibabacloud.com/api/PAILangStudio/2024-07-10/CreateKnowledgeBaseJob' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
  "WorkspaceId": "478**",
  "JobAction": "SyncIndex",
  "MaxRunningTimeInSeconds": 86400,
  "EcsSpecs": [
    {
      "Type": "Worker",
      "InstanceType": "ecs.c6.large",
      "PodCount": 1,
      "CPU": 2,
      "GPU": 1,
      "Memory": 8,
      "SharedMemory": 16,
      "GPUType": "16",
      "Driver": "535.161.08"
    }
  ],
  "UserVpc": {
    "VpcId": "vpc-wz90****5v23",
    "VSwitchId": "vsw-wz9r****ng10",
    "SecurityGroupId": "sg-wz9i****1129"
  },
  "EmbeddingConfig": {
    "BatchSize": 8,
    "Concurrency": 1
  }
}'
```

## Response Format

```json
{
  "WorkspaceId": "478**",
  "KnowledgeBaseId": "d-ksicx823d",
  "RequestId": "48E6392E-C3C9-5212-9FAD-13256ABD9AF6"
}
```

**Key Fields**:
- `KnowledgeBaseId` — Unique identifier for the created knowledge base
- `WorkspaceId` — ID of the workspace associated with the knowledge base
- `RequestId` — Unique ID for the API request (useful for debugging)
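
A short sketch of reading these fields defensively, using the sample response above as input. The function name `parse_create_response` is hypothetical.

```python
import json

EXPECTED_FIELDS = ("KnowledgeBaseId", "WorkspaceId", "RequestId")


def parse_create_response(raw: str) -> dict:
    """Decode a response body and confirm the documented key fields are present."""
    data = json.loads(raw)
    missing = [field for field in EXPECTED_FIELDS if field not in data]
    if missing:
        raise KeyError(f"response is missing fields: {missing}")
    return data


sample = """
{
  "WorkspaceId": "478**",
  "KnowledgeBaseId": "d-ksicx823d",
  "RequestId": "48E6392E-C3C9-5212-9FAD-13256ABD9AF6"
}
"""
result = parse_create_response(sample)
print(result["KnowledgeBaseId"], result["RequestId"])
```

Logging `RequestId` alongside any failure makes it easier to trace a specific call later.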

## Error Handling

| Error Code | Description | Recommended Action |
|-------------------|----------------------------|----------------------------------------|
| 400 | Invalid request parameters. Check the request body for missing or malformed fields. | Validate all required fields and ensure values meet constraints (e.g., enum values, string formats). |
| 401 | Unauthorized access. Verify your API key or RAM permissions. | Ensure `DASHSCOPE_API_KEY` is valid and correctly set in the Authorization header. |
| 403 | Access denied. The RAM user or role does not have permission to perform this operation. | Grant the RAM user the `pailangstudio:CreateKnowledgeBase` (or relevant) action with write access. |
| 404 | Resource not found. The specified workspace ID or connection ID does not exist. | Verify workspace and connection IDs using `ListWorkspaces` and `ListConnections`. |
| 429 | Too many requests. Rate limit exceeded. Wait before retrying. | Implement exponential backoff; respect the 100 QPS limit. |
| 500 | Internal server error. Retry the request after a short delay. | Retry with jittered backoff; contact Alibaba Cloud support if persistent. |

### Rate Limits & Retry
- **QPS Limit**: 100 queries per second per API key/account
- **Retry Strategy**: Use exponential backoff with jitter for 429/500 errors
- **Retry-After Header**: Not currently used; choose retry delays on the client side (e.g., 1s, 2s, 4s, with jitter added)
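
The retry strategy above can be sketched as follows. This is one possible implementation, using "full jitter" (a random delay between zero and an exponentially growing ceiling); the function name `call_with_retry` and the `send` callable, which is assumed to return a `(status_code, body)` tuple, are hypothetical.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {429, 500}


def call_with_retry(send: Callable[[], Tuple[int, str]],
                    max_retries: int = 3,
                    base_delay: float = 1.0,
                    max_delay: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() and retry on 429/500 with exponential backoff and full jitter."""
    attempt = 0
    while True:
        status, body = send()
        # Stop on success, on a non-retryable error, or when retries are exhausted.
        if status not in RETRYABLE_STATUS or attempt >= max_retries:
            return status, body
        # The ceiling doubles each attempt (1s, 2s, 4s, ...) up to max_delay;
        # the actual delay is drawn uniformly below the ceiling.
        ceiling = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, ceiling))
        attempt += 1
```

The `sleep` parameter is injectable so the loop can be exercised in tests without waiting. Errors such as 400, 401, 403, and 404 are returned immediately, since retrying them without changing the request would not help.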

## Environment Requirements

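- Install the DashScope SDK: `pip install "dashscope>=1.14.0"` (the quotes stop the shell from treating `>=` as a redirect)
- Install `requests` to run the Python examples in this document: `pip install requests`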
- Set your API key: `export DASHSCOPE_API_KEY=your_key_here`
- Python 3.8+ recommended for SDK compatibility

## FAQ

Q: How do I obtain a WorkspaceId?
A: Call the `ListWorkspaces` API to retrieve available workspace IDs in your account.

Q: What embedding models are supported for different knowledge base types?
A: TEXT/STRUCTURED: text-embedding-v1 to v4; IMAGE: multimodal-embedding-v1; VIDEO: qwen2.5-vl-embedding.

Q: Can I use my own vector database?
A: Yes, supported types are Elasticsearch, Milvus, and Faiss (Faiss only for TEXT/STRUCTURED).

Q: How do I filter retrieved results by metadata?
A: Use the `MetaDataFilterConditions` parameter as a JSON string with `and`/`or` logic and operators like `==`, `!=`, or `contains`.

Q: Are knowledge base operations synchronous or asynchronous?
A: All API calls are synchronous—responses return immediately without requiring polling.

## Pricing & Billing

### Billing Model
Per-request billing for all operations (create, list, retrieve, etc.).

### Price Reference

| Tier | Input Price | Output Price |
|------|-------------|--------------|
| standard | 0.002 /tokens | 0.003 /tokens |
| default | 0.001 / | 0.001 / |
| standard | 0.001 / | 0.002 / |

### Free Tier
- Monthly free quota of 100 tokens for embedding-related operations
- 10,000 free retrieval requests per month
- 100–1000 free API calls per month depending on operation type

### Usage Limits
- 100 QPS per API key/account
- Single request max 8K–8192 tokens depending on operation
- Max job runtime: 86400 seconds (24 hours)

### Billing Notes
- Creating a knowledge base triggers vector embedding computation, billed by input token count
- Retrieval requests are billed per call regardless of result size
- Exceeding free tier leads to standard per-request charges