# bailian-integration

Part of **BAILIAN**

<!-- intent-backlink:auto -->

> 💡 **Path Selection**: This skill is one implementation path for [Integrate external tools, MCP servers, and web search into AI agents](../../intent/bailian-integrate-mcp/SKILL.md). If you're unsure which path to take, check the routing skill first.

# Bailian Developer Tools and Support

## Capabilities Overview

| Sub-capability | Models | API Pattern | Description |
|----------------|--------|-------------|-------------|
| MCP Server Connection | qwen3.7-max, qwen3.6-plus, qwen3.5-plus + 5 more | OpenAI Compatible (Streaming) | Connect LLMs to external tools and data sources using the Model Context Protocol (MCP). |

## Model Selection Guide

### MCP Server Connection

| Model ID | API Pattern |
|----------|-------------|
| qwen3.7-max | OpenAI Compatible (Streaming) |
| qwen3.6-plus | OpenAI Compatible (Streaming) |
| qwen3.5-plus | OpenAI Compatible (Streaming) |
| qwen3.6-flash | OpenAI Compatible (Streaming) |
| qwen3.5-flash | OpenAI Compatible (Streaming) |
| qwen3.6-27b | OpenAI Compatible (Streaming) |
| qwen3.6-open-source | OpenAI Compatible (Streaming) |
| qwen3.5-open-source | OpenAI Compatible (Streaming) |

## API Calling Patterns

### Authentication

**Primary Method: Bearer Token**

Use the `DASHSCOPE_API_KEY` environment variable with Bearer token authentication.

```http
Authorization: Bearer $DASHSCOPE_API_KEY
```

Set the environment variable before making API calls:

```bash
export DASHSCOPE_API_KEY=your_api_key_here
```
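
When the variable is unset, the request would otherwise go out with an unusable header. A minimal sketch of failing fast instead; the helper name `bearer_header` is hypothetical and not part of any SDK:

```python
import os


def bearer_header(env_var: str = "DASHSCOPE_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    Raises a clear error instead of sending an empty or "None" token.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; export it before calling the API")
    return {"Authorization": f"Bearer {api_key}"}
```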

### Service Endpoint

The Responses API is served from a region-specific endpoint. The OpenAI SDK takes the same URL without the trailing `/responses` as its base URL:

- **China Region**: `https://dashscope.aliyuncs.com/compatible-mode/v1/responses`
- **International Region**: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1/responses`

Both endpoints accept POST requests with JSON payloads.
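
One way to keep the region choice in a single place is a small lookup, sketched below. The region keys `"cn"` and `"intl"` and both function names are illustrative assumptions; only the URLs come from this document.

```python
# Base URLs as passed to the OpenAI SDK (without the trailing /responses).
BASE_URLS = {
    "cn": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "intl": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
}


def base_url_for(region: str) -> str:
    """Return the SDK base URL for a region key ("cn" or "intl")."""
    try:
        return BASE_URLS[region]
    except KeyError:
        raise ValueError(f"unknown region {region!r}; expected one of {sorted(BASE_URLS)}")


def responses_endpoint(region: str) -> str:
    """Return the full Responses API endpoint for raw HTTP calls."""
    return base_url_for(region) + "/responses"
```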

### OpenAI Compatible (Streaming) Pattern

MCP integration uses the Responses API (`client.responses.create`), which is compatible with the OpenAI SDK. This pattern supports both synchronous and streaming responses.

**Calling Flow:**

1. Initialize the OpenAI client with the Bailian base URL and API key
2. Configure the MCP tool object with server details (protocol, label, URL, headers)
3. Call `client.responses.create()` with the model, input prompt, and tools array
4. Parse the response object to extract `output_text` and `usage` metrics

**Key Headers:**
- `Authorization: Bearer $DASHSCOPE_API_KEY` - Authentication
- `Content-Type: application/json` - Request format

**MCP Tool Configuration:**

The MCP tool object must include:
- `type`: Always set to "mcp"
- `server_protocol`: Currently only "sse" (Server-Sent Events) is supported
- `server_label`: A unique identifier for the MCP server
- `server_url`: The SSE endpoint URL of the MCP server
- `headers`: Optional authentication headers for the MCP server (if required)
- `server_description`: Optional but recommended description to help the model understand the server's capabilities

## Parameter Reference

### MCP Tool Configuration

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| type | string | Yes | - | Must be "mcp" | Tool type identifier |
| server_protocol | string | Yes | - | one of: sse | Communication protocol with the MCP server |
| server_label | string | Yes | - | - | Label name to identify the MCP server |
| server_description | string | No | - | - | Description of the MCP server's features to improve model understanding |
| server_url | string | Yes | - | - | Endpoint URL of the MCP server |
| headers | object | No | - | - | Request headers for MCP server authentication (e.g., Authorization) |
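
The table above can be turned into a small builder that fills the fixed fields and omits the optional ones when they are not supplied. This is a sketch only: `build_mcp_tool` is a hypothetical helper, and the URL scheme check is an added assumption rather than a documented constraint.

```python
from typing import Optional


def build_mcp_tool(
    server_label: str,
    server_url: str,
    server_description: Optional[str] = None,
    headers: Optional[dict] = None,
) -> dict:
    """Assemble an MCP tool object with the required and optional fields."""
    if not server_label:
        raise ValueError("server_label is required")
    # Assumption: MCP SSE endpoints are plain HTTP(S) URLs.
    if not server_url.startswith(("http://", "https://")):
        raise ValueError("server_url must be an http(s) URL")

    tool = {
        "type": "mcp",             # fixed value
        "server_protocol": "sse",  # only supported protocol
        "server_label": server_label,
        "server_url": server_url,
    }
    if server_description:
        tool["server_description"] = server_description
    if headers:
        tool["headers"] = dict(headers)  # copy so the caller's dict is not shared
    return tool
```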

### Responses API Parameters

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| model | string | Yes | - | See Model Selection Guide | Model ID to use for inference |
| input | string | Yes | - | - | User prompt or question |
| tools | array | Yes | - | - | Array of tool configurations (MCP tool objects) |
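
For callers that do not use the OpenAI SDK, the three parameters map directly onto a JSON POST body. The sketch below only constructs the request with the standard library and does not send it; `build_responses_request` is a hypothetical name.

```python
import json
import urllib.request


def build_responses_request(endpoint, api_key, model, prompt, tools):
    """Build (but do not send) a POST request for the Responses API."""
    payload = {"model": model, "input": prompt, "tools": list(tools)}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; that step is left out here so the snippet has no network side effects.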

## Code Examples

### MCP Server Connection - Python - China Region

```python
import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, pass api_key="sk-xxx" directly (not recommended).
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
)

# MCP tool configuration
mcp_tool = {
    "type": "mcp",
    "server_protocol": "sse",
    "server_label": "WebParser",
    "server_description": "WebParser MCP service for parsing web page content.",
    "server_url": "https://dashscope.aliyuncs.com/api/v1/mcps/WebParser/sse",
    "headers": {
        "Authorization": "Bearer " + os.getenv("DASHSCOPE_API_KEY")
    }
}

response = client.responses.create(
    model="qwen3.6-plus",
    input="Which models are supported in https://help.aliyun.com/zh/model-studio/mcp ?",
    tools=[mcp_tool]
)

print("[Model Response]")
print(response.output_text)
print(f"\n[Token Usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
```

**Note:** For International Region, use `base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"` and replace `server_url` with your MCP server's SSE endpoint.

### MCP Server Connection - Node.js - China Region

```javascript
import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If the environment variable is not set, pass apiKey: "sk-xxx" directly (not recommended).
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    // MCP tool configuration
    const mcpTool = {
        type: "mcp",
        server_protocol: "sse",
        server_label: "WebParser",
        server_description: "WebParser MCP service for parsing web page content.",
        server_url: "https://dashscope.aliyuncs.com/api/v1/mcps/WebParser/sse",
        headers: {
            "Authorization": "Bearer " + process.env.DASHSCOPE_API_KEY
        }
    };

    const response = await openai.responses.create({
        model: "qwen3.6-plus",
        input: "Which models are supported in https://help.aliyun.com/zh/model-studio/mcp ?",
        tools: [mcpTool]
    });

    console.log("[Model Response]");
    console.log(response.output_text);
    console.log(`\n[Token Usage] Input: ${response.usage.input_tokens}, Output: ${response.usage.output_tokens}, Total: ${response.usage.total_tokens}`);
}

main();
```

**Note:** For International Region, use `baseURL: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"` and replace `server_url` with your MCP server's SSE endpoint.

### MCP Server Connection - curl - China Region

```bash
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.6-plus",
    "input": "Which models are supported in https://help.aliyun.com/zh/model-studio/mcp ?",
    "tools": [
        {
            "type": "mcp",
            "server_protocol": "sse",
            "server_label": "WebParser",
            "server_description": "WebParser MCP service for parsing web page content.",
            "server_url": "https://dashscope.aliyuncs.com/api/v1/mcps/WebParser/sse",
            "headers": {
                "Authorization": "Bearer your-api-key"
            }
        }
    ]
}'
```

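**Note:** For International Region, use `https://dashscope-intl.aliyuncs.com/compatible-mode/v1/responses` as the endpoint. The JSON body is single-quoted, so the shell does not expand variables inside it; replace `your-api-key` with the key your MCP server expects.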

### MCP Server Connection - Python - International Region

```python
import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, pass api_key="sk-xxx" directly (not recommended).
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
)

# Replace server_url with the SSE Endpoint that you got from a platform such as ModelScope
# If authentication is required, add the token from the corresponding platform to the headers
mcp_tool = {
    "type": "mcp",
    "server_protocol": "sse",
    "server_label": "fetch",
    "server_description": "Fetch MCP Server that provides web scraping capabilities. It can scrape the content of a specified URL and return it as text.",
    "server_url": "https://mcp.api-inference.modelscope.net/xxx/sse",
}

response = client.responses.create(
    model="qwen3.6-plus",
    input="https://news.aibase.com/zh/news, what is the AI news today?",
    tools=[mcp_tool]
)

print("[Model Response]")
print(response.output_text)
print(f"\n[Token Usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")
```

## Response Format

```json
{
  "output_text": "Based on the documentation for the Model Context Protocol (MCP)...",
  "usage": {
    "input_tokens": 20583,
    "output_tokens": 1638,
    "total_tokens": 22221
  }
}
```

**Key Fields:**
- `output_text` - The model's generated response text
- `usage.input_tokens` - Number of tokens in the input prompt
- `usage.output_tokens` - Number of tokens in the model's response
- `usage.total_tokens` - Total tokens consumed (input + output)

### Streaming Chunk Format

For streaming responses, chunks contain:
- `event.type` - Event type identifier
- `event.delta` - Incremental text content
- `event.response.usage` - Token usage metrics (in final chunk)
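
A streamed response can be consumed by joining the deltas and reading usage from the final event, as sketched below. The event type strings `response.output_text.delta` and `response.completed` follow the OpenAI Responses streaming convention and are an assumption here, since this document does not list them. The fake events stand in for the iterator returned when `stream=True` is passed to `client.responses.create`.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Join incremental text deltas and capture usage from the final event.

    Assumes OpenAI-style event names; adjust the strings if the service differs.
    """
    parts = []
    usage = None
    for event in events:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
        elif event.type == "response.completed":
            usage = event.response.usage
    return "".join(parts), usage


# Offline stand-in for: client.responses.create(..., stream=True)
fake_stream = [
    SimpleNamespace(type="response.output_text.delta", delta="Hello, "),
    SimpleNamespace(type="response.output_text.delta", delta="MCP."),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(usage=SimpleNamespace(total_tokens=12)),
    ),
]
text, usage = collect_stream(fake_stream)
print(text, usage.total_tokens)
```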

## Error Handling

| Error Code | Description | Recommended Action |
|------------|-------------|-------------------|
| 400 | Bad Request - Invalid request body or missing required parameters. | Validate request structure and ensure all required fields are present. |
| 401 | Unauthorized - Invalid or missing API key or authentication token. | Verify DASHSCOPE_API_KEY is set correctly and not expired. |
| 404 | Not Found - The specified MCP server URL is unreachable or incorrect. | Check server_url is correct and the MCP server is running. |
| 429 | Too Many Requests - Rate limit exceeded. | Back off exponentially and retry after a delay. |
| 500 | Internal Server Error - An unexpected error occurred on the server side. | Retry the request; if persistent, contact support. |

### Rate Limits & Retry

**Rate Limit:** 100 QPS (queries per second) per model

**Retry Strategy:**
- For 429 errors: Implement exponential backoff starting at 1 second
- For 500 errors: Retry up to 3 times with 2-second intervals
- Honor the `Retry-After` header when the response includes one
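
The retry rules above can be sketched as a delay function plus a small loop. Several details are assumptions made for illustration: the names `retry_delay` and `call_with_retry`, the cap of 5 retries for 429 responses (the document gives no cap), a numeric `Retry-After` value in seconds, and a `send` callable that returns `(status, headers, body)`.

```python
import time


def retry_delay(status, attempt, retry_after=None, max_429_retries=5):
    """Return seconds to wait before the next attempt, or None to stop.

    `attempt` counts retries already performed, starting at 0.
    """
    if status == 429 and attempt < max_429_retries:
        if retry_after is not None:
            return float(retry_after)   # server-provided delay wins
        return float(2 ** attempt)      # 1s, 2s, 4s, ...
    if status == 500 and attempt < 3:
        return 2.0                      # fixed 2-second interval, up to 3 retries
    return None


def call_with_retry(send, sleep=time.sleep):
    """Call `send()` until it succeeds or the retry budget is exhausted."""
    attempt = 0
    while True:
        status, headers, body = send()
        delay = retry_delay(status, attempt, headers.get("Retry-After"))
        if delay is None:
            return status, body
        sleep(delay)
        attempt += 1
```

Injecting `sleep` keeps the loop testable without real waiting.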

## Requirements

**SDK Dependencies:**
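- Python: `openai>=1.0.0`, `dashscope>=1.14.0` (the `client.responses` interface only exists in newer `openai` releases; upgrade the package if the attribute is missing)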
- Node.js: `openai` package (latest version)

**Environment Setup:**
```bash
export DASHSCOPE_API_KEY=your_api_key_here
```

**Installation:**
```bash
pip install "openai>=1.0.0" "dashscope>=1.14.0"
```

## FAQ

**Q: Can I use the standard Chat Completions API with MCP tools?**

A: No, MCP is only supported via the Responses API (`client.responses.create`). The standard Chat Completions API does not support MCP tool configurations.

**Q: What protocol does MCP use to communicate with external servers?**

A: Currently, only the SSE (Server-Sent Events) protocol is supported, selected with `server_protocol: "sse"`. The MCP server must expose an SSE endpoint.

**Q: Do I need to authenticate with the MCP server separately?**

A: Yes, if the MCP server requires authentication, you must include the appropriate headers in the `headers` field of the MCP tool configuration. This is separate from the Bailian API authentication.

**Q: How many MCP servers can I configure in a single request?**

A: You can configure up to 10 MCP servers in the tools array for a single request.

**Q: Why is my MCP server returning a 404 error?**

A: A 404 error typically means the MCP server URL is incorrect or the server is not running. Verify the `server_url` points to a valid SSE endpoint and that the MCP server is accessible.

## Pricing & Billing

### Billing Model

MCP integration uses per-token billing for model inference. MCP server fees are subject to individual server billing rules and are separate from model inference costs.

### Price Reference

| Model | Input Price | Output Price |
|-------|-------------|--------------|
| qwen3.6-plus | 0.002 CNY per 1K tokens | 0.004 CNY per 1K tokens |
| qwen3.5-plus | 0.0015 CNY per 1K tokens | 0.003 CNY per 1K tokens |
| qwen3.6-flash | 0.001 CNY per 1K tokens | 0.002 CNY per 1K tokens |
| qwen3.5-flash | 0.0008 CNY per 1K tokens | 0.0016 CNY per 1K tokens |
| qwen3.6-open-source | 0.0005 CNY per 1K tokens | 0.001 CNY per 1K tokens |
| qwen3.5-open-source | 0.0004 CNY per 1K tokens | 0.0008 CNY per 1K tokens |
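
As a worked example of per-token billing, the sketch below estimates the inference cost of one request from its `usage` numbers. The prices are copied from the table above; the function name is hypothetical, and the estimate ignores the free tier and any MCP server fees.

```python
# (input price, output price) in CNY per 1K tokens, from the table above.
PRICES_CNY_PER_1K = {
    "qwen3.6-plus": (0.002, 0.004),
    "qwen3.5-plus": (0.0015, 0.003),
    "qwen3.6-flash": (0.001, 0.002),
    "qwen3.5-flash": (0.0008, 0.0016),
    "qwen3.6-open-source": (0.0005, 0.001),
    "qwen3.5-open-source": (0.0004, 0.0008),
}


def estimate_cost_cny(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate model inference cost in CNY for a single request."""
    input_price, output_price = PRICES_CNY_PER_1K[model]
    return input_tokens / 1000 * input_price + output_tokens / 1000 * output_price
```

With the sample usage from the Response Format section (20,583 input and 1,638 output tokens) on `qwen3.6-plus`, this gives 20.583 × 0.002 + 1.638 × 0.004, or about 0.0477 CNY.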

### Free Tier

1 million tokens free per month

### Usage Limits

Maximum 10 MCP servers per request
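
A client-side check can catch an oversized `tools` array before the request is sent. In this sketch, `validate_tools` is a hypothetical helper, and the duplicate-label check reflects one reading of `server_label` as a unique identifier rather than a documented server-side rule.

```python
MAX_MCP_SERVERS = 10  # per-request limit stated above


def validate_tools(tools: list) -> list:
    """Raise ValueError if the MCP entries in `tools` break the limits."""
    mcp_tools = [tool for tool in tools if tool.get("type") == "mcp"]
    if len(mcp_tools) > MAX_MCP_SERVERS:
        raise ValueError(
            f"{len(mcp_tools)} MCP servers configured; the limit is {MAX_MCP_SERVERS}"
        )
    labels = [tool.get("server_label") for tool in mcp_tools]
    if len(labels) != len(set(labels)):
        raise ValueError("server_label values must be unique within a request")
    return tools
```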

### Billing Notes

Model inference fees are billed based on token usage (input + output tokens). MCP server fees are charged separately according to each server's individual billing rules.

## Source Documents

- `MCP_5953719.xdita`