# rds-ai

Part of **RDS**

# ApsaraDB RDS AI Assistant and Conversational AI

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Get Model Operator Info | Synchronous | Retrieve information about available model operators for AI assistants. |
| Get Model Operator Order | Synchronous | Fetch details about model operator orders and subscriptions. |
| Query Token Usage Records | Synchronous | Retrieve detailed token usage records for AI assistant operations. |
| Get Conversation History | Synchronous | Retrieve the history of conversations with an AI assistant. |
| Get Conversation Messages | Synchronous | Fetch individual messages from a specific conversation thread. |
| Modify Message Feedback | Synchronous | Update user feedback for specific AI conversation messages. |
| List Skills | Synchronous | Retrieve a list of available AI skills. |
| List Custom Agent Tools | Synchronous | Retrieve a list of tools available to custom agents. |
| Query Instance List | Streaming | Use chat interface to query RDS instance information. |
| Embed RDS Copilot | Synchronous | Integrate RDS Copilot into custom web applications programmatically. |
| Manage AI Long-term Memory | Synchronous | Create and manage personalized AI memory records for user interactions. |

## API Calling Patterns

### Authentication
The primary authentication method is Bearer Token authentication.

- Include the header: `Authorization: Bearer <your_api_key>`
- Set the environment variable: `DASHSCOPE_API_KEY` for most APIs
- For the Long-term Memory service, use: `Authorization: Token <your_api_key>` with environment variable `MEM0_API_KEY`
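
The two header schemes above can be captured in one small helper. This is a minimal sketch: the function name `build_auth_headers` and its `service` argument are hypothetical, and the fallback key values are placeholders.

```python
import os


def build_auth_headers(api_key: str, service: str = "default") -> dict:
    """Return the Authorization header for the given service.

    `service` is a hypothetical label: "memory" selects the Token scheme used
    by the Long-term Memory service; any other value selects Bearer.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    scheme = "Token" if service == "memory" else "Bearer"
    return {"Authorization": f"{scheme} {api_key}"}


if __name__ == "__main__":
    # Keys come from the environment variables named above; the fallback
    # values are placeholders so the sketch runs without configuration.
    print(build_auth_headers(os.getenv("DASHSCOPE_API_KEY", "sk-example")))
    print(build_auth_headers(os.getenv("MEM0_API_KEY", "m0-example"), service="memory"))
```

Keeping the scheme selection in one place makes it harder to send a Bearer header to the Long-term Memory service by mistake.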

### Service Endpoint
The APIs use region-specific endpoints with the pattern: `https://{product}.{region}.aliyuncs.com`

Common regions include:
- `cn-hangzhou` (China Hangzhou)
- `cn-shanghai` (China Shanghai)
- `cn-beijing` (China Beijing)

For international regions, use `api.alibabacloud.com` instead of `api.aliyun.com`.
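
As an illustration of the endpoint pattern in this section, the following sketch fills in the template and picks the API portal host. The helper names are hypothetical, and `rdsai` is an assumed product code used only as an example; check the endpoint list for the real value.

```python
def build_endpoint(product: str, region: str) -> str:
    """Fill in the https://{product}.{region}.aliyuncs.com pattern."""
    if not product or not region:
        raise ValueError("product and region must be non-empty")
    return f"https://{product}.{region}.aliyuncs.com"


def api_portal_host(international: bool) -> str:
    """Pick the API portal host: alibabacloud.com for international regions."""
    return "api.alibabacloud.com" if international else "api.aliyun.com"


if __name__ == "__main__":
    # "rdsai" is an assumed product code, used only for illustration.
    print(build_endpoint("rdsai", "cn-hangzhou"))
    print(api_portal_host(international=True))
```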

### Synchronous Pattern
Most APIs follow a synchronous request-response pattern:

1. Send an HTTP request (GET or POST) to the endpoint with required parameters
2. Include the Authorization header with your API key
3. Receive a JSON response immediately
4. Parse the response for success/failure status and data
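
Step 4 can be sketched as a small parser. It assumes every synchronous response uses the `RequestId` / `Success` / `Message` / `Data` envelope shown in the Response Format section; `ApiError` and `parse_response` are hypothetical names.

```python
import json


class ApiError(Exception):
    """Raised when a response reports Success == false."""


def parse_response(raw: str) -> dict:
    """Return the Data object of a successful response, or raise ApiError."""
    payload = json.loads(raw)
    if not payload.get("Success", False):
        # Include the RequestId so the failure can be traced with support.
        raise ApiError(
            f"{payload.get('Message')} (RequestId={payload.get('RequestId')})"
        )
    return payload.get("Data", {})


if __name__ == "__main__":
    sample = '{"RequestId": "r-1", "Success": true, "Message": "success", "Data": {"UsedQuota": 1000000}}'
    print(parse_response(sample))
```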

### Streaming Pattern
The ChatMessages API uses a streaming pattern for real-time responses:

1. Send a POST request with query and inputs parameters
2. The server responds with a stream of JSON chunks using Server-Sent Events (SSE)
3. Each chunk contains an `event` field and an `answer` field
4. Continue reading chunks until the stream ends
5. The final response includes a complete answer and conversationId for multi-turn conversations
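
When reading the raw stream without the SDK, the steps above might look like the sketch below. It assumes standard SSE framing (`data: <json>` lines) and that the conversation ID appears in a chunk under the key `conversationId`; neither detail is confirmed here, and `collect_answer` is a hypothetical helper.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_answer(lines: Iterable[str]) -> Tuple[str, Optional[str]]:
    """Concatenate `answer` fragments from SSE lines and capture the conversation ID."""
    parts = []
    conversation_id = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        try:
            chunk = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore payloads that are not JSON
        if chunk.get("event") == "message":
            parts.append(chunk.get("answer", ""))
        # Assumed key name; keep the last value seen.
        conversation_id = chunk.get("conversationId", conversation_id)
    return "".join(parts), conversation_id


if __name__ == "__main__":
    stream = [
        'data: {"event":"message","answer":"Hello, "}',
        "",
        'data: {"event":"message","answer":"world","conversationId":"c1"}',
    ]
    print(collect_answer(stream))
```

Passing the returned conversation ID as `conversationId` on the next request continues the same conversation.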

## Parameter Reference

### Get Model Operator Info

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| InstanceId | string | false | null | null | The instance ID. |

### Query Token Usage Records

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| Region | string | false | null | null | The region where the instance is located. |
| InstanceId | string | true | null | null | The instance ID. |
| ConsumerName | string | false | null | null | The consumer associated with the API key. |
| Model | string | false | null | null | The model name. |
| StartTime | string | false | null | ISO 8601 format, UTC | The beginning of the query's time range. Specify the time in ISO 8601 format and UTC. |
| EndTime | string | false | null | ISO 8601 format, UTC | The end of the query's time range. Specify the time in ISO 8601 format and UTC. |
| Page | integer | false | 1 | min: 1 | The page number. Valid values start from 1. Default value: 1. |
| PageSize | integer | false | null | null | The number of records to return on each page. |

### Get Conversation History

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| LastId | string | false | null | null | The ID of the last conversation. |
| Limit | string | false | null | range 1-100 | The number of entries per page. Valid values: 1 to 100. |
| Pinned | string | false | null | null | Specifies whether to pin the application. |
| SortBy | string | false | null | null | The sort order of the returned conversations. |

### Get Conversation Messages

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| ConversationId | string | false | null | null | The ID of the conversation. |
| FirstId | string | false | null | null | The ID of the message from which to start fetching the list. Use this for pagination. |
| Limit | integer | false | 100 | range 1-100 | The maximum number of messages to return per page. Valid values: 1–100. Default: 100. |

### Modify Message Feedback

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| MessageId | string | false | null | null | The message ID. |
| Rating | string | false | null | one of: like, dislike | The rating of the message. Valid values: like, dislike. |
| Content | string | false | null | max length 6000 chars | The feedback content. Maximum length: 6,000 characters. |
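
The constraints in this table (allowed ratings, content length) are cheap to validate before calling the API. A minimal sketch, with `build_feedback_params` as a hypothetical helper name:

```python
from typing import Optional

VALID_RATINGS = ("like", "dislike")
MAX_CONTENT_CHARS = 6000


def build_feedback_params(
    message_id: str, rating: str, content: Optional[str] = None
) -> dict:
    """Validate the documented constraints and build the request parameters."""
    if rating not in VALID_RATINGS:
        raise ValueError(f"Rating must be one of {VALID_RATINGS}, got {rating!r}")
    if content is not None and len(content) > MAX_CONTENT_CHARS:
        raise ValueError(f"Content exceeds {MAX_CONTENT_CHARS} characters")
    params = {"MessageId": message_id, "Rating": rating}
    if content is not None:
        params["Content"] = content
    return params


if __name__ == "__main__":
    print(build_feedback_params("msg-001", "like", "Accurate answer."))
```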

### List Skills

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| PageNumber | integer | false | 1 | default: 1 | The page number. Pages start from page 1. Default value: 1. |
| PageSize | integer | false | 20 | default: 20, max: 100 | The number of records to return on each page. Default value: 20. Maximum value: 100. |
| Language | string | false | null | one of: zh-CN, zh-TW, en-US, ja-JP | The languages supported by the skills. |

### Query Instance List (Chat Interface)

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| query | string | true | null | null | The natural language question or command to send to the chat interface. |
| inputs | object | true | null | null | Additional input parameters including region, timezone, language, and optional custom agent ID. |
| conversationId | string | false | null | null | Unique identifier for a multi-turn conversation session. If provided, continues the existing conversation; if not, starts a new one. |
| customAgentId | string | false | null | null | ID of a custom agent to route the query through. Enables specialized behavior based on predefined system prompts and tools. |

### Manage AI Long-term Memory

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| messages | array | true | null | null | List of message objects containing role and content for memory creation. |
| user_id | string | true | null | null | Unique identifier for the user whose memory is being managed. |
| agent_id | string | false | null | null | Identifier for the AI agent associated with the memory. |
| enable_graph | boolean | false | false | true or false | Whether to enable graph-based memory storage and retrieval. |
| query | string | true | null | null | Text query used to search for relevant memories. |
| limit | integer | false | 10 | range 1-100 | Maximum number of results to return in a search. |
| memory_id | string | true | null | null | Unique identifier of the memory to retrieve, update, or delete. |

## Code Examples

### Query Token Usage Records - Python - All Regions

```python
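import os

import dashscope

# Read the API key from the DASHSCOPE_API_KEY environment variable rather than hard-coding it
dashscope.api_key = os.getenv('DASHSCOPE_API_KEY')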

# Define the parameters
params = {
    'InstanceId': 'rds_copilot***_public_cn-*********6',
    'StartTime': '2025-12-13T16:00:00Z',
    'EndTime': '2026-01-04T16:00:00Z',
    'Page': 1,
    'PageSize': 10
}

# Call the API
response = dashscope.RdsAi.DescribeMOTokenUsageDetail.call(**params)

# Print the response
print(response)
```

### List Skills - Python - All Regions

```python
import requests

url = "https://rdsai.aliyuncs.com/api/v1/skills"
params = {
    "PageNumber": 1,
    "PageSize": 20,
    "Language": "zh-CN"
}
headers = {
    "Authorization": "Bearer <your-api-key>",
    "Content-Type": "application/json"
}

response = requests.get(url, params=params, headers=headers)
print(response.json())
```

### Query Instance List - Python - All Regions

```python
# -*- coding: utf-8 -*-
import os
import sys
from typing import List

from alibabacloud_rdsai20250507.client import Client as RdsAi20250507Client
from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_rdsai20250507 import models as rds_ai_20250507_models
from alibabacloud_tea_util import models as util_models

class ChatMessagesSingleTurnDemo:
    def __init__(self):
        pass

    @staticmethod
    def create_client() -> RdsAi20250507Client:

        # Make sure that the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables are set.
        config = open_api_models.Config(
            access_key_id=os.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"),
            access_key_secret=os.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET")
        )
        # For the endpoint, see https://api.aliyun.com/product/RdsAi
        config.endpoint = 'rdsai.aliyuncs.com'
        return RdsAi20250507Client(config)

    @staticmethod
    def main() -> None:
        client = ChatMessagesSingleTurnDemo.create_client()
        # To use a custom agent for the conversation, pass the custom agent ID. The custom agent ID is returned after you successfully call the CreateCustomAgent operation. You can also call the ListCustomAgent operation to query the list of created custom agents. For more information, see the ListCustomAgent API documentation.
        # inputs = rds_ai_20250507_models.ChatMessagesRequestInputs(language="zh-CN", region_id="cn-hangzhou", timezone="Asia/Shanghai",
        #                                                  custom_agent_id="5f1bbe8a-88d8-4a72-81e5-2a5d9d43****")
        inputs = rds_ai_20250507_models.ChatMessagesRequestInputs(
            language="zh-CN",
            region_id="cn-hangzhou",
            timezone="Asia/Shanghai"
        )
        # For a multi-turn conversation, specify the conversationId. The conversation ID is returned after you successfully call the ChatMessages operation. For more information, see the ChatMessages API documentation.
        chat_messages_request = rds_ai_20250507_models.ChatMessagesRequest(
            query="Query the list of instances in the China (Hangzhou) region",
            inputs=inputs
        )

        runtime = util_models.RuntimeOptions()
        chat_messages_response = client.chat_messages_with_sse(tmp_req=chat_messages_request, runtime=runtime)
        for chunk in chat_messages_response:
            body = chunk.body
            if body is not None and body.event == 'message':
                print(f"{body.answer}", end="")

if __name__ == '__main__':
    ChatMessagesSingleTurnDemo.main()
```

### Embed RDS Copilot - Python - All Regions

```python
import os
import json
import requests
from flask import Flask, render_template, jsonify
from aliyunsdkcore.client import AcsClient
from aliyunsdksts.request.v20150401.AssumeRoleRequest import AssumeRoleRequest

app = Flask(__name__, template_folder='.')

# Read configuration from environment variables
ACCOUNT_ID = os.environ.get("ALIYUN_ACCOUNT_ID")
ACCESS_KEY_ID = os.environ.get("ALIYUN_ACCESS_KEY_ID")
ACCESS_KEY_SECRET = os.environ.get("ALIYUN_ACCESS_KEY_SECRET")
RAM_ROLE = os.environ.get("ALIYUN_RAM_ROLE", "rdscopilot-console")
REGION_ID = "cn-hangzhou"
SIGN_IN_DOMAIN = "https://signin.aliyun.com/federation"
DESTINATION = "https://rdsnext4servims.console.alibabacloud.com/rdsCopilotFree/cn-hangzhou?hideTopbar=true&copilotReference_min"

def assume_role():
    """Step 1: Get temporary credentials from STS."""
    client = AcsClient(ACCESS_KEY_ID, ACCESS_KEY_SECRET, REGION_ID)
    request = AssumeRoleRequest()
    request.set_RoleArn(f"acs:ram::{ACCOUNT_ID}:role/{RAM_ROLE}")
    request.set_RoleSessionName("rds-ai-assistant-session")
    request.set_DurationSeconds(3600)  # Credentials expire after 1 hour

    response = client.do_action_with_exception(request)
    return json.loads(response)["Credentials"]

def get_signin_token(ak, sk, token):
    """Step 2: Exchange temporary credentials for a sign-in token."""
    params = {
        "Action": "GetSigninToken",
        "AccessKeyId": ak,
        "AccessKeySecret": sk,
        "SecurityToken": token,
        "TicketType": "mini"  # Required for iframe embedding
    }
    resp = requests.get(SIGN_IN_DOMAIN, params=params)
    if resp.status_code == 200:
        return resp.json().get("SigninToken")
    else:
        raise Exception(f"Failed to get SigninToken: {resp.status_code}, {resp.text}")

def get_login_url(signin_token):
    """Step 3: Construct the logon-free URL."""
    params = {
        "Action": "Login",
        "LoginUrl": "https://signin.aliyun.com/login.htm",
        "Destination": DESTINATION,
        "SigninToken": signin_token
    }
    req = requests.Request('GET', SIGN_IN_DOMAIN, params=params).prepare()
    return req.url

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/get_console_url')
def get_url():
    try:
        creds = assume_role()
        signin_token = get_signin_token(
            creds['AccessKeyId'],
            creds['AccessKeySecret'],
            creds['SecurityToken']
        )
        login_url = get_login_url(signin_token)
        return jsonify({"url": login_url})
    except Exception as e:
        return jsonify({"error": str(e)}), 500

if __name__ == '__main__':
    # Debug mode is disabled because the server listens on all interfaces
    app.run(host='0.0.0.0', port=5023, debug=False)
```

### Manage AI Long-term Memory - Python - All Regions

```python
import os
import json
import time
from mem0 import MemoryClient

# ============================================

def main():
    # Get configuration from environment variables
    api_key = os.environ.get("MEM0_API_KEY")
    host = os.environ.get("MEM0_HOST")

    print("=" * 60)
    print("Mem0 SDK Basic Functionality Test")
    print("=" * 60)
    print(f"Service endpoint: {host}")

    # Initialize the client
    client = MemoryClient(host=host, api_key=api_key)

    # Test user
    user_id = f"sdk_test_user_{int(time.time())}"
    print(f"Test user: {user_id}")
    print("=" * 60)

    memory_id = None

    try:
        # =================== 1. Add a memory (Add) ===================
        print("\n[1] Add a memory (Add)")
        print("-" * 40)

        messages = [
            {"role": "user", "content": "My name is Zhang San. I work in Beijing and I like playing basketball."},
            {"role": "assistant", "content": "Hi Zhang San! Beijing is a great city, and basketball is a great sport."},
        ]

        result = client.add(messages, user_id=user_id)
        print(f"Add result: {json.dumps(result, ensure_ascii=False, indent=2)}")

        # Wait for processing to complete
        time.sleep(2)

        # =================== 2. Get all memories (Get All) ===================
        print("\n[2] Get all memories (Get All)")
        print("-" * 40)

        all_memories = client.get_all(user_id=user_id)
        print(f"Memory list: {json.dumps(all_memories, ensure_ascii=False, indent=2)}")

        memories = all_memories.get("results", [])
        print(f"Total memories: {len(memories)}")

        if memories:
            memory_id = memories[0]["id"]
            print(f"First memory ID: {memory_id}")

        # =================== 3. Search memories (Search) ===================
        print("\n[3] Search memories (Search)")
        print("-" * 40)

        search_result = client.search("Where does Zhang San work?", user_id=user_id)
        print(f"Search result: {json.dumps(search_result, ensure_ascii=False, indent=2)}")

        # =================== 4. Get a single memory (Get) ===================
        print("\n[4] Get a single memory (Get)")
        print("-" * 40)

        if memory_id:
            single_memory = client.get(memory_id)
            print(f"Memory details: {json.dumps(single_memory, ensure_ascii=False, indent=2)}")
        else:
            print("No available memory ID")

        # =================== 5. Update a memory (Update) ===================
        print("\n[5] Update a memory (Update)")
        print("-" * 40)

        if memory_id:
            new_content = "Zhang San now works in Shanghai and still likes to play basketball"
            update_result = client.update(memory_id, new_content)
            print(f"Update result: {json.dumps(update_result, ensure_ascii=False, indent=2)}")
        else:
            print("No available memory ID")

        # =================== 6. Get memory history (History) ===================
        print("\n[6] Get memory history (History)")
        print("-" * 40)

        if memory_id:
            history = client.history(memory_id)
            print(f"History: {json.dumps(history, ensure_ascii=False, indent=2)}")
        else:
            print("No available memory ID")

        # =================== 7. Delete a single memory (Delete) ===================
        print("\n[7] Delete a single memory (Delete)")
        print("-" * 40)

        if memory_id:
            delete_result = client.delete(memory_id)
            print(f"Delete result: {json.dumps(delete_result, ensure_ascii=False, indent=2)}")
        else:
            print("No available memory ID")

    except Exception as e:
        print(f"\nError: {e}")
        import traceback
        traceback.print_exc()
        return 1

    return 0

if __name__ == "__main__":
    raise SystemExit(main())
```

### Query Token Usage Records - curl - All Regions

```bash
curl -X GET 'https://api.aliyun.com/api/RdsAi/2025-05-07/DescribeMOTokenUsageDetail?Region=cn-beijing&InstanceId=rds_copilot***_public_cn-*********6&StartTime=2025-12-13T16:00:00Z&EndTime=2026-01-04T16:00:00Z&Page=2&PageSize=10' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
```

### List Skills - curl - All Regions

```bash
curl -X GET 'https://rdsai.api.aliyun.com/2025-05-07/ListSkill?PageNumber=1&PageSize=20&Language=zh-CN' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "x-api-key: $DASHSCOPE_API_KEY"
```

### Manage AI Long-term Memory - bash - All Regions

```bash
curl -X POST "http://<host>/memory/v1/memories/" \
  -H "Authorization: Token <api-key>" \
  -H "Content-Type: application/json" \
  -d '{
  "messages": [
    {"role": "user", "content": "My name is Zhang San and I work at Alibaba"}
  ],
  "user_id": "user_001",
  "agent_id": "my_agent",
  "enable_graph": false
}'
```

## Response Format

```json
{
  "RequestId": "FE9C65D7-930F-57A5-A207-8C396329241C",
  "Success": true,
  "Message": "success",
  "Data": {
    "TotalQuota": 200000000,
    "UsedQuota": 1000000,
    "AutoRenew": true,
    "InstanceId": "rds_copilot***_public_cn-*********6",
    "InstanceClass": "xlarge",
    "Status": "active/creating",
    "StartTime": 1772439028000,
    "EndTime": 1775145600000,
    "ApiKey": "sk-rds-xxx",
    "BaseUrl": "http://xxx.yy/v1",
    "ChargeType": "PREPAY / POSTPAY",
    "DailyUsage": [
      {
        "Date": "2026-03-31",
        "Usage": 100000
      }
    ],
    "KeyUsageList": [
      {
        "ApiKey": "sk-rds-*****",
        "KeyName": "api-*****",
        "KeyType": "fixed",
        "KeyUsed": "1000000",
        "DailyUsage": [
          {
            "Date": "2026-03-31",
            "Usage": "2000"
          }
        ],
        "UsedQuota": "2000000",
        "Deleted": true
      }
    ]
  }
}
```

**Key Fields**:
- `RequestId` — Unique identifier for the API request
- `Success` — Indicates whether the request was successful
- `Message` — Status message describing the result
- `Data.TotalQuota` — Total token quota allocated to the instance
- `Data.UsedQuota` — Number of tokens already consumed
- `Data.AutoRenew` — Whether the subscription auto-renews
- `Data.InstanceId` — Unique identifier for the RDS AI instance
- `Data.InstanceClass` — Size/class of the AI instance (e.g., xlarge)
- `Data.Status` — Current status of the instance (active/creating)
- `Data.StartTime` — Start time of the subscription (Unix timestamp in milliseconds)
- `Data.EndTime` — End time of the subscription (Unix timestamp in milliseconds)
- `Data.ApiKey` — API key for accessing the service
- `Data.BaseUrl` — Base URL for making API calls
- `Data.ChargeType` — Billing method (PREPAY or POSTPAY)
- `Data.DailyUsage` — Daily token consumption records
- `Data.KeyUsageList` — Detailed usage per API key
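
As a worked example of reading these fields, the sketch below derives the remaining quota and converts the subscription timestamps to UTC datetimes. It assumes the timestamps are Unix milliseconds, which the 13-digit sample values suggest; `summarize_quota` is a hypothetical helper.

```python
from datetime import datetime, timezone


def ms_to_utc(milliseconds: int) -> datetime:
    """Convert a Unix timestamp in milliseconds to a UTC datetime."""
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


def summarize_quota(data: dict) -> dict:
    """Derive remaining quota and subscription window from the Data object."""
    # int() tolerates numeric strings, which the sample uses in KeyUsageList.
    total = int(data["TotalQuota"])
    used = int(data["UsedQuota"])
    return {
        "remaining": total - used,
        "used_fraction": used / total if total else 0.0,
        "start": ms_to_utc(int(data["StartTime"])),
        "end": ms_to_utc(int(data["EndTime"])),
    }


if __name__ == "__main__":
    sample = {
        "TotalQuota": 200000000,
        "UsedQuota": 1000000,
        "StartTime": 1772439028000,
        "EndTime": 1775145600000,
    }
    summary = summarize_quota(sample)
    print(summary["remaining"], summary["start"].isoformat(), summary["end"].isoformat())
```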

### Streaming Chunk Format
```json
{"event":"message","answer":"The list of instances in the China (Hangzhou) region is: instance-12345, instance-67890."}
```

## Error Handling

| Error Code | Description | Recommended Action |
|-------------------|---------------------------|----------------------------------------|
| 400 | Invalid IP address or CIDR in whitelist. | Verify that your IP address is correctly added to the whitelist in CIDR notation. |
| 400 | Domain is not ready yet, please try again later. | Wait a few minutes and retry the request. The domain may still be provisioning. |
| 403 | There is no valid order for this UID | Purchase or activate the RDS AI Assistant Ultimate Edition before using this API. |
| InvalidParameter | One or more parameters are invalid. Check the parameter values and ensure they meet the required format and constraints. | Review the API documentation for parameter requirements and correct your request. |
| UnauthorizedOperation | The request is not authorized. Ensure the API key has sufficient permissions to access the resource. | Verify your API key permissions and ensure it has the necessary RAM policies attached. |
| Throttling | The request rate exceeds the allowed limit. Reduce the frequency of requests or implement exponential backoff. | Implement rate limiting in your client or request a quota increase from Alibaba Cloud. |
| 404 | User does not exist. | Verify that the user ID or conversation ID is correct and exists in the system. |
| 401 | Unauthorized access. Verify your API key is correct and properly formatted in the Authorization header. | Check that your API key is valid and correctly included in the Authorization header. |
| 429 | Too many requests. You have exceeded the rate limit. Wait before retrying or contact support to increase limits. | Implement exponential backoff in your retry logic or reduce request frequency. |
| 500 | Internal server error. Retry after a short delay. If persistent, contact support. | Retry the request after a brief delay. If the error persists, contact Alibaba Cloud support. |

### Rate Limits & Retry
- Query Token Usage Records: 100 QPS per account
- List Skills: 100 QPS per user
- List Custom Agent Tools: 100 QPS per account
- Query Instance List: 100 QPS per account
- Manage AI Long-term Memory: 100 QPS per user ID

For rate-limited APIs, implement exponential backoff with jitter. When receiving a 429 error, respect the Retry-After header if present, or wait at least 1 second before retrying.
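
The retry guidance above could be implemented roughly as follows. This is a sketch, not a definitive client: `send` is a hypothetical callable that performs one request and returns `(status_code, headers, body)`, and the base delay, cap, and attempt count are illustrative defaults.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Return seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # Retry-After may be an HTTP date; fall back to backoff
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter picks a value in the upper half of the window; never below 1 s.
    return max(1.0, ceiling * (0.5 + 0.5 * rng()))


def call_with_retry(send, max_attempts: int = 5, sleep=time.sleep):
    """Call `send` until it returns a status other than 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, headers.get("Retry-After")))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Injecting `sleep` and `rng` keeps the helper easy to test without real delays.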

## Environment Requirements

- For Query Token Usage Records: `dashscope>=1.14.0`
- For Query Instance List: `alibabacloud_rdsai20250507>=1.0.0, alibabacloud_tea_openapi>=1.0.0, alibabacloud_tea_util>=1.0.0`
- For Embed RDS Copilot: `Flask==2.3.0, requests==2.31.0, aliyun-python-sdk-core==2.13.36, aliyun-python-sdk-sts==3.1.0`
- For Manage AI Long-term Memory: `mem0ai>=1.0.0`

Environment variables:
- `export DASHSCOPE_API_KEY=your_key` (for most APIs)
- `export MEM0_API_KEY=your_key` and `export MEM0_HOST=http://<YOUR-HOST>:80/memory` (for Long-term Memory)
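- `export ALIBABA_CLOUD_ACCESS_KEY_ID=your_id` and `export ALIBABA_CLOUD_ACCESS_KEY_SECRET=your_secret` (for the Query Instance List SDK example)
- `export ALIYUN_ACCOUNT_ID=your_account_id`, `export ALIYUN_ACCESS_KEY_ID=your_id`, and `export ALIYUN_ACCESS_KEY_SECRET=your_secret` (for the Embed RDS Copilot example; `ALIYUN_RAM_ROLE` is optional and defaults to `rdscopilot-console`)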

## FAQ

Q: How do I authenticate with the RDS AI Assistant APIs?
A: Most APIs use Bearer Token authentication with the `DASHSCOPE_API_KEY` environment variable. Set the header `Authorization: Bearer $DASHSCOPE_API_KEY` in your requests. For the Long-term Memory service, use `Authorization: Token $MEM0_API_KEY` instead.

Q: What regions are supported for the RDS AI Assistant APIs?
A: The APIs support multiple regions including cn-hangzhou, cn-shanghai, and cn-beijing. Use `api.aliyun.com` for China regions and `api.alibabacloud.com` for international regions. Some APIs like ListSkill use a global endpoint (`rdsai.aliyuncs.com`).

Q: How can I handle streaming responses from the ChatMessages API?
A: The ChatMessages API returns Server-Sent Events (SSE) with JSON chunks. Each chunk contains an `event` field (typically "message") and an `answer` field with partial response text. Continue reading chunks until the stream ends to get the complete response.

Q: What should I do if I receive a 403 error when calling the APIs?
A: A 403 error typically means you don't have a valid subscription to the RDS AI Assistant Ultimate Edition. Check your order status using the GetModelOperatorOrder API and ensure you have an active subscription.

Q: How do I embed RDS Copilot in my web application securely?
A: Use the federation endpoint approach with RAM roles and STS temporary credentials. Create a backend service that generates time-limited logon-free URLs using the RAM user's credentials, then load these URLs in an iframe on your frontend. Never expose permanent credentials in client-side code.

## Pricing & Billing

### Billing Model
- Get Model Operator Info: per_token
- Get Model Operator Order: subscription
- Query Token Usage Records: per_token
- Get Conversation History: per_request
- List Skills: per_request
- List Custom Agent Tools: per_request
- Query Instance List: per_request
- Embed RDS Copilot: free
- Manage AI Long-term Memory: per_request

### Price Reference

| Tier/Model | Input Price | Output Price | Other Fees |
|------------|-------------|--------------|------------|
| RDS AI Assistant Ultimate Edition | 0.002 /tokens | 0.002 /tokens | |
| qwen-flash | 0.0001 /tokens | 0.0001 /tokens | |
| default | 0.002 / | 0.002 / | |
| standard | 0.0001 / | 0.0001 / | |
| Professional Edition | 0.002 / | 0.003 / | |
| RDS for PostgreSQL instance | varies by instance type | varies by instance type | |
| LLM and embedding model invocation | pay-as-you-go | pay-as-you-go | |

### Free Tier
- Query Token Usage Records: 100 tokens 
- Get Conversation History: 1000 
- List Skills: 1000 
- List Custom Agent Tools: 1000 
- Query Instance List: 100 
- Embed RDS Copilot: No cost to embed RDS Copilot in a custom application. Usage of RDS Copilot features is subject to the underlying RDS instance billing model.
- Manage AI Long-term Memory: Compute resources for Long-term Memory service are currently free

### Usage Limits
- Query Token Usage Records: 100 QPS, 8K tokens
- Get Conversation History: 100 QPS
- List Skills: 100 QPS, 100 
- List Custom Agent Tools: 100 QPS per account
- Query Instance List: 100 QPS
- Manage AI Long-term Memory: 100 QPS per user ID

### Billing Notes

- Token usage is billed based on total tokens consumed. Charges apply only when the request is successfully processed.
- Each API call counts as one request. Free tier resets monthly.
- The logon-free URL must be generated per request. Caching or reusing URLs will result in access failure. No additional charges apply for embedding.
- The official billing start date will be announced separately. Model invocation fees are charged per API call.