# oceanbase-sql_optimization

# OceanBase SQL Optimization and Performance

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Create Outline | Synchronous | Create an outline for SQL optimization. |
| Optimize Distributed Query Execution | Synchronous | Manage and optimize distributed execution plans. |
| Analyze SQL Query Plan | Synchronous | Use EXPLAIN to examine the execution plan of a SQL query. |
| Control Join Algorithm | Synchronous | Use hints to influence the join algorithm selection. |
| Control Parallel Execution | Synchronous | Use hints to manage parallel query execution. |
| Control Query Execution Plan | Synchronous | Use policy-related hints to guide the optimizer's plan choices. |
| View Cached Execution Plans | Synchronous | Inspect execution plans stored in the plan cache. |
| View Query Execution Plan | Synchronous | Retrieve the execution plan for SQL queries to analyze performance characteristics. |
| Find Top Queries by Execution Time | Synchronous | Identify the SQL queries with the longest execution times within a specified period. |
| Monitor Query Execution Details | Synchronous | Track detailed execution plan statistics for individual SQL queries. |

## API Calling Patterns

### Authentication
OceanBase SQL optimization features are accessed directly through SQL statements executed against the database. No external API authentication is required beyond standard database credentials.

- Use your OceanBase database username and password to connect via MySQL or Oracle protocol clients.
- Credentials are typically provided during connection initialization (e.g., in `obclient -h<host> -P<port> -u<user>@<tenant> -p`).
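
As a small sketch of how those connection arguments fit together, the hypothetical helper `obclient_argv` below assembles the command line from the template above. The function name and the sample host, port, user, and tenant values are illustrative assumptions; only the flags shown in the template are used.

```python
def obclient_argv(host, port, user, tenant):
    """Build the obclient argument list from the template
    `obclient -h<host> -P<port> -u<user>@<tenant> -p`.

    The trailing bare -p asks the client to prompt for the password
    instead of taking it from the command line.
    """
    return ["obclient", f"-h{host}", f"-P{int(port)}", f"-u{user}@{tenant}", "-p"]


# Sample values are placeholders, not a real cluster.
argv = obclient_argv("127.0.0.1", 2881, "root", "sys")
```

The resulting list can be passed to `subprocess.run(argv)` to start an interactive session.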

### Service Endpoint
SQL optimization operations are performed directly on your OceanBase cluster endpoint:

- Connect to your OceanBase tenant using standard database connection methods
- Common regions include: cn-hangzhou, cn-shanghai, cn-beijing (use your actual cluster endpoint)

### Synchronous Pattern
All SQL optimization operations in OceanBase follow a synchronous pattern:
1. Establish a database connection using standard MySQL/Oracle client protocols
2. Execute SQL statements containing optimization commands (EXPLAIN, CREATE OUTLINE, etc.) or hints
3. Receive immediate results from the database engine
4. Parse the returned execution plan or confirmation message

For example:
- Use `EXPLAIN SELECT ...` to immediately get the execution plan
- Use `CREATE OUTLINE ...` to immediately create a plan binding
- Query system views like `gv$sql_audit` to immediately retrieve performance data
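
The four steps above can be sketched with any Python DB-API driver. The example below is only an illustration of the synchronous call shape: it uses the standard-library `sqlite3` module as a stand-in so that it runs without a cluster. Against OceanBase you would open the connection with a MySQL-protocol driver instead and use that driver's placeholder style. The helper name `run_statement` and the table `t1` are assumptions.

```python
import sqlite3


def run_statement(conn, sql, params=()):
    """Steps 2-4: execute one statement, wait for it, return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)   # blocks until the engine has finished
        return cur.fetchall()      # rows are available right after execute()
    finally:
        cur.close()


# Step 1: open a connection (sqlite3 stands in for an OceanBase driver here).
conn = sqlite3.connect(":memory:")
run_statement(conn, "CREATE TABLE t1 (c1 INTEGER, c2 INTEGER)")
run_statement(conn, "INSERT INTO t1 VALUES (1, 1), (2, 1)")
rows = run_statement(conn, "SELECT COUNT(*) FROM t1 WHERE c2 = ?", (1,))
conn.close()
```

There is no job ID or polling loop: the call to `execute()` returns only once the statement has completed, which is what "synchronous" means for every capability listed in this document.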

## Parameter Reference

### Create Outline

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| outline_name | string | true | null | null | The name of the outline to be created. |
| OR REPLACE | boolean | false | null | null | If specified and an outline with the same name already exists, the existing outline is replaced. |
| stmt | string | true | null | null | Required in the SQL-text form. Generally a DML statement that contains hints and original parameters. |
| TO target_stmt | string | false | null | null | Without TO target_stmt, an incoming SQL statement is bound to the hints in stmt when its parameterized text matches the parameterized text of stmt with the hints removed. To fix the plan for a statement that itself contains hints, specify that original statement with TO target_stmt. |
| sql_id | string | true | null | null | Required in the SQL-ID form. If the SQL statement that corresponds to sql_id already contains hints, the hints that you specify when you create the outline overwrite all of them. |
| hint | string | true | null | must be in /*+ xxx */ format | Required in the SQL-ID form. The hints to bind to the statement identified by sql_id. |
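
The two forms described by these parameters can be sketched as string builders. This is a minimal illustration, not an official client API: the function names, the identifier check, and the alphanumeric check on `sql_id` are assumptions added for safety, and the statement shapes follow the code examples later in this document.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_HINT = re.compile(r"/\*\+.*\*/", re.S)  # the /*+ xxx */ format from the table


def _head(outline_name, or_replace):
    if not _IDENT.fullmatch(outline_name):
        raise ValueError(f"invalid outline name: {outline_name!r}")
    verb = "CREATE OR REPLACE OUTLINE" if or_replace else "CREATE OUTLINE"
    return f"{verb} {outline_name}"


def outline_from_sql_text(outline_name, stmt, target_stmt=None, or_replace=False):
    """SQL-text form: ON stmt [TO target_stmt]."""
    sql = f"{_head(outline_name, or_replace)} ON {stmt.strip().rstrip(';')}"
    if target_stmt is not None:
        sql += f" TO {target_stmt.strip().rstrip(';')}"
    return sql + ";"


def outline_from_sql_id(outline_name, sql_id, hint, or_replace=False):
    """SQL-ID form: ON 'sql_id' USING HINT /*+ ... */."""
    if not sql_id.isalnum():  # assumption: reject quotes and whitespace
        raise ValueError(f"invalid sql_id: {sql_id!r}")
    if not _HINT.fullmatch(hint.strip()):
        raise ValueError("hint must be in /*+ xxx */ format")
    return f"{_head(outline_name, or_replace)} ON '{sql_id}' USING HINT {hint.strip()};"


by_text = outline_from_sql_text(
    "otl_idx_c2", "SELECT /*+ index(t1 idx_c2) */ * FROM t1 WHERE c2 = 1;")
by_id = outline_from_sql_id(
    "otl_idx_c2", "ED570339F2C856BA96008A29EDF04C74", "/*+ index(t1 idx_c2)*/")
```

Keeping the two forms in separate functions mirrors the parameter table: `stmt` and `TO target_stmt` belong to one form, `sql_id` and `hint` to the other.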

### Analyze SQL Query Plan

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| tbl_name | string | false | null | null | Specifies the table name. |
| col_name | string | false | null | null | Specifies the column name of the table. |
| wild | string | false | null | null | Wildcard pattern to match column names. |
| BASIC | flag | false | null | null | Specifies the basic information about the output plan, such as the operator ID, operator name, and referenced table name. |
| OUTLINE | flag | false | null | null | Specifies that the output plan information includes the outline information. |
| EXTENDED | flag | false | null | null | Specifies that the EXPLAIN statement generates additional information including input/output columns, partition info, and filter details. |
| EXTENDED_NOADDR | flag | false | null | null | Displays the additional information in a simplified format. |
| PARTITIONS | flag | false | null | null | Displays partition-related information. |
| FORMAT | enum | false | TRADITIONAL | one of: TRADITIONAL, JSON | Specifies the output format of EXPLAIN: TRADITIONAL or JSON. |
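
A hedged sketch of how the options in this table combine into a statement: the hypothetical helper `build_explain` accepts either one of the plan-detail flags or a `FORMAT` value and rejects anything outside the table. Treating the flags and `FORMAT` as mutually exclusive is an assumption; check the syntax reference for your OceanBase version.

```python
EXPLAIN_TYPES = {"BASIC", "OUTLINE", "EXTENDED", "EXTENDED_NOADDR", "PARTITIONS"}
EXPLAIN_FORMATS = {"TRADITIONAL", "JSON"}


def build_explain(stmt, explain_type=None, fmt=None):
    """Return an EXPLAIN statement for stmt using the options in the table."""
    if explain_type is not None and fmt is not None:
        # Assumption: only one option is passed per EXPLAIN statement.
        raise ValueError("pass either explain_type or fmt, not both")
    parts = ["EXPLAIN"]
    if explain_type is not None:
        if explain_type.upper() not in EXPLAIN_TYPES:
            raise ValueError(f"unsupported EXPLAIN type: {explain_type!r}")
        parts.append(explain_type.upper())
    if fmt is not None:
        if fmt.upper() not in EXPLAIN_FORMATS:
            raise ValueError(f"unsupported format: {fmt!r}")
        parts.append(f"FORMAT = {fmt.upper()}")
    parts.append(stmt.strip().rstrip(";"))
    return " ".join(parts) + ";"


basic = build_explain("SELECT COUNT(*) FROM stok", explain_type="basic")
as_json = build_explain("SELECT COUNT(*) FROM stok;", fmt="json")
```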

### View Cached Execution Plans

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| TENANT_ID | BIGINT(20) | true | null | null | The ID of the tenant. |
| IP | varchar(32) | true | null | null | The IP address of the OBServer. |
| PORT | BIGINT(20) | true | null | null | The port number of the OBServer. |
| PLAN_ID | BIGINT(20) | true | null | null | The ID of the plan. |
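
All four columns are required, so a lookup of one cached plan has to filter on each of them. The sketch below builds such a query with driver-side placeholders. The view name `oceanbase.gv$plan_cache_plan_explain` is an assumption that is not stated in this document and may differ between OceanBase versions, and the `%s` placeholder style assumes a MySQL-protocol Python driver.

```python
# Assumption: view name; pass view=... to override for your version.
PLAN_EXPLAIN_VIEW = "oceanbase.gv$plan_cache_plan_explain"


def cached_plan_query(tenant_id, ip, port, plan_id, view=PLAN_EXPLAIN_VIEW):
    """Return (sql, params) selecting one cached plan by its four key columns."""
    sql = (
        f"SELECT * FROM {view} "
        "WHERE tenant_id = %s AND ip = %s AND port = %s AND plan_id = %s"
    )
    return sql, (int(tenant_id), str(ip), int(port), int(plan_id))


# Sample values are placeholders.
sql, params = cached_plan_query(1001, "10.0.0.1", 2882, 42)
```

The pair can be passed directly to a DB-API cursor as `cursor.execute(sql, params)`.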

### Find Top Queries by Execution Time

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| tenant_id | integer | true | null | null | The ID of the tenant to filter queries for. |
| IS_EXECUTOR_RPC | integer | true | null | 0 or 1 | Filter out executor RPC calls. Set to 0 to exclude them. |
| request_time | timestamp | true | null | range: time_to_usec(now()) - 10000000 to time_to_usec(now()) | The time range for query execution. Filtered between current time minus 10 seconds and current time. |
| LIMIT | integer | false | 10 | max 100 | Limit the number of returned rows to the top N queries. |
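
The constraints in this table (a window expressed in microseconds, `IS_EXECUTOR_RPC = 0`, and a `LIMIT` of at most 100) can be captured in a small query builder. This is a sketch under those stated constraints; the function name and the `window_seconds` argument are assumptions.

```python
def top_queries_sql(tenant_id, window_seconds=10, limit=10):
    """Build the top-N slow query statement against gv$sql_audit."""
    if not 1 <= int(limit) <= 100:
        raise ValueError("limit must be between 1 and 100")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # request_time is compared in microseconds: 10 s -> 10000000 usec.
    window_usec = int(window_seconds * 1_000_000)
    return (
        "SELECT sql_id, elapsed_time, trace_id "
        "FROM oceanbase.gv$sql_audit "
        f"WHERE tenant_id = {int(tenant_id)} "
        "AND IS_EXECUTOR_RPC = 0 "
        f"AND request_time > (time_to_usec(now()) - {window_usec}) "
        "AND request_time < time_to_usec(now()) "
        f"ORDER BY elapsed_time DESC LIMIT {int(limit)};"
    )


top10 = top_queries_sql(1001)
```

Widening the window, for example `top_queries_sql(1001, window_seconds=60)`, changes only the subtracted microsecond offset.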

## Code Examples

### Create Outline Using SQL Text - SQL - all

```sql
CREATE OUTLINE otl_idx_c2 
       ON SELECT/*+ index(t1 idx_c2)*/ * FROM t1 WHERE c2 = 1;
```

### Create Outline Using SQL ID - SQL - all

```sql
CREATE OUTLINE otl_idx_c2 
ON 'ED570339F2C856BA96008A29EDF04C74'
USING HINT /*+ index(t1 idx_c2)*/ ;
```

### Analyze Query Plan with EXPLAIN - SQL - all

```sql
-- \G is the obclient terminator that prints the result vertically.
EXPLAIN SELECT COUNT(*) FROM stok\G
```

### Find Top 10 Slow Queries - SQL - all

```sql
SELECT /*+ PARALLEL(15) */ sql_id, elapsed_time, trace_id
  FROM oceanbase.gv$sql_audit
 WHERE tenant_id = 1001
   AND IS_EXECUTOR_RPC = 0
   AND request_time > (time_to_usec(now()) - 10000000)
   AND request_time < time_to_usec(now())
 ORDER BY elapsed_time DESC
 LIMIT 10;
```

### Control Join Algorithm with USE_NL Hint - SQL - all

```sql
SELECT /*+ USE_NL(l h) */ h.customer_id, l.unit_price * l.quantity
  FROM orders h, order_items l
 WHERE l.order_id = h.order_id;
```
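
When join hints are applied programmatically, for example while experimenting with different algorithms for the same query, the hint comment has to be placed directly after the `SELECT` keyword. The sketch below is one way to do that; the helper name `add_join_hint` is hypothetical, and it only handles statements that start with `SELECT` and do not already carry a hint.

```python
import re

JOIN_HINTS = {"USE_NL", "USE_HASH", "USE_MERGE"}


def add_join_hint(sql, hint, tables):
    """Insert /*+ HINT(t1 t2 ...) */ right after the leading SELECT keyword."""
    if hint not in JOIN_HINTS:
        raise ValueError(f"unsupported join hint: {hint!r}")
    hint_text = f"/*+ {hint}({' '.join(tables)}) */"
    # A function replacement avoids backslash handling in the hint text.
    new_sql, count = re.subn(
        r"^\s*SELECT\b",
        lambda m: f"{m.group(0)} {hint_text}",
        sql,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("statement does not start with SELECT")
    return new_sql


hinted = add_join_hint(
    "SELECT * FROM t1 JOIN t2 ON t1.id = t2.id", "USE_HASH", ["t1", "t2"])
```

Running `EXPLAIN` on the hinted statement is the usual way to confirm that the optimizer actually picked the requested join algorithm.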

### Control Parallel Execution with PARALLEL Hint - SQL - all

```sql
SELECT /*+ PARALLEL(3) */ MAX(L_QUANTITY) FROM table_name;
```

### Distributed Query Optimization with PQ_DISTRIBUTE - SQL - all

```sql
-- \G already terminates the statement in obclient, so no trailing semicolon is needed.
EXPLAIN BASIC
SELECT /*+ USE_PX PARALLEL(3) PQ_DISTRIBUTE(t2 BROADCAST NONE) LEADING(t1 t2) */ *
  FROM t1 JOIN t2 ON t1.c2 = t2.c2\G
```

### Query Policy Control with USE_JIT Hint - SQL - all

```sql
SELECT /*+ USE_JIT */ e.department_id, SUM(e.salary)
  FROM employees e
 WHERE e.department_id = 1001
 GROUP BY e.department_id;
```

## Response Format

```json
{
  "ID":2,
  "OPERATOR":"JOIN",
  "NAME":"JOIN",
  "EST.ROWS":9800999,
  "COST":5933108,
  "output": [
    "t1.c1",
    "t1.c2",
    "t2.c1",
    "t2.c2"
  ],
  "TABLE SCAN": {
    "ID":0,
    "OPERATOR":"TABLE SCAN",
    "NAME":"TABLE SCAN",
    "EST.ROWS":10000,
    "COST":6218,
    "output": [
      "t2.c2",
      "t2.c1"
    ]
  },
  "TABLE SCAN": {
    "ID":1,
    "OPERATOR":"TABLE SCAN",
    "NAME":"TABLE SCAN",
    "EST.ROWS":100000,
    "COST":68477,
    "output": [
      "t1.c2",
      "t1.c1"
    ]
  }
}
```

**Key Fields**:
- `ID` — Unique identifier for each operator in the execution plan
- `OPERATOR` — Type of operation being performed (e.g., HASH JOIN, TABLE SCAN)
- `NAME` — Name of the operator or referenced table
- `EST.ROWS` — Estimated number of rows processed by this operator
- `COST` — Estimated execution cost for this operator
- `output` — Columns produced by this operator
- `filter` — Conditions applied to filter rows
- `equal_conds` — Join conditions for equality-based joins
- `access` — Columns accessed during table scan operations
- `partitions` — Partition information for partitioned tables
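
The sample response repeats the `"TABLE SCAN"` key for both child operators, so loading it into a plain dictionary would keep only the last one. A minimal sketch of one way to preserve every operator is to use the `object_pairs_hook` parameter of Python's `json.loads` and collect nested objects as children. The `children` field and the helper names are assumptions of this sketch, not part of the response format, and the sample text is a trimmed copy of the response above.

```python
import json


def _node(pairs):
    """Turn one JSON object into a plan node; nested objects become children."""
    node = {"children": []}
    for key, value in pairs:
        if isinstance(value, dict) and "children" in value:
            node["children"].append(value)  # a child operator, key may repeat
        else:
            node[key] = value               # scalar field or list such as output
    return node


def parse_plan(text):
    return json.loads(text, object_pairs_hook=_node)


def walk(node, depth=0):
    """Yield (depth, node) for every operator, parent before children."""
    yield depth, node
    for child in node["children"]:
        yield from walk(child, depth + 1)


sample = """
{"ID":2, "OPERATOR":"JOIN", "EST.ROWS":9800999, "COST":5933108,
 "TABLE SCAN": {"ID":0, "OPERATOR":"TABLE SCAN", "EST.ROWS":10000, "COST":6218},
 "TABLE SCAN": {"ID":1, "OPERATOR":"TABLE SCAN", "EST.ROWS":100000, "COST":68477}}
"""
plan = parse_plan(sample)
operators = [(depth, n["ID"], n["OPERATOR"]) for depth, n in walk(plan)]
costliest = max((n for _, n in walk(plan)), key=lambda n: n["COST"])
```

Walking the tree this way makes it straightforward to list operators by `COST` or `EST.ROWS` when looking for the expensive part of a plan.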

## Error Handling

| Error Code | Description | Recommended Action |
|------------|-------------|-------------------|
| 400 | Invalid SQL syntax or unsupported EXPLAIN type. | Check your SQL syntax and ensure you're using supported EXPLAIN options. |
| 404 | Table or column not found. | Verify that the specified tables and columns exist in your schema. |
| 500 | Internal server error during plan generation. | Retry the request; if the issue persists, contact support with your query details. |

## FAQ

Q: How do I analyze why my query is running slowly?
A: Run EXPLAIN on the query to view its execution plan. Look at the `COST` and `EST.ROWS` columns to identify expensive operators, and check whether the expected indexes are being used.

Q: What's the difference between CREATE OUTLINE with SQL_TEXT vs SQL_ID?
A: SQL_TEXT creates an outline based on the literal SQL statement text, while SQL_ID creates an outline based on the internal hash identifier of a previously executed query. SQL_ID is useful when you want to bind hints to a query that's already been parsed and executed.

Q: How can I force a specific join algorithm?
A: Use join hints like USE_NL (nested loop), USE_HASH (hash join), or USE_MERGE (merge join) in your query. For example: `SELECT /*+ USE_HASH(t1 t2) */ * FROM t1 JOIN t2 ON t1.id = t2.id`.

Q: How do I find the most resource-intensive queries in my system?
A: Query the gv$sql_audit system view with appropriate filters on tenant_id and time range, ordering by elapsed_time in descending order to get the top N slowest queries.

Q: Are there any costs associated with using these optimization features?
A: Most SQL optimization features like EXPLAIN and hints are included in standard OceanBase licensing with no additional charges. However, querying system views like gv$sql_audit may be subject to usage-based billing in some deployment scenarios.

## Pricing & Billing

### Billing Model
per_request

### Price Reference

| Tier | Input Price | Output Price |
|------|-------------|--------------|
| standard_query | 0.0001 / | 0.0001 / |

### Free Tier
 1000 

### Usage Limits
100 QPS per tenant

### Billing Notes
Queries on gv$sql_audit are charged based on execution count; long-running queries may incur higher costs due to resource usage.