# oceanbase-sql

Part of **OCEANBASE**

<!-- intent-backlink:auto -->

> 💡 **Path Selection**: This skill is one implementation path for the following routing skills. If you're unsure which path to take, check the corresponding routing skill:

> - [Manage distributed database transactions](../../intent/oceanbase-manage-transactions/SKILL.md)
> - [Optimize a slow-running SQL query](../../intent/oceanbase-optimize-query/SKILL.md)

# OceanBase SQL Execution and Querying

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|----------------|--------------|-------------|
| Execute Dynamic SQL | Synchronous | Execute dynamic SQL statements using DBMS_SQL functions. |
| Group Data by Column | Synchronous | Use GROUP BY to aggregate data based on column values. |
| Query SQL Workarea Statistics | Synchronous | Access statistics about SQL workarea memory usage. |
| Test for NULL Value | Synchronous | Check if a value is NULL in SQL queries. |
| Check If Subquery Returns Rows | Synchronous | Use the EXISTS condition to test whether a subquery returns any rows. |
| Check Membership in List or Subquery | Synchronous | Use the IN condition to test whether a value appears in a list or a subquery result. |
| Perform Difference Set Operation | Synchronous | Use EXCEPT/MINUS to find rows in one query result that are not in another. |
| Perform Hierarchical Query | Synchronous | Execute hierarchical queries using hierarchical query operators. |
| Remove Duplicate Rows | Synchronous | Remove duplicate rows from query results using DISTINCT. |
| Exchange Data Between Threads | Synchronous | Use the EXCHANGE operator to transfer data between parallel execution threads. |
| Execute Parallel Query with Partition or Block Iteration | Synchronous | Leverage GI operators for efficient parallel query execution over partitions or blocks. |
| View Parallel Query Worker Stats | Synchronous | Monitor statistics for parallel query execution workers. |
| Monitor Parallel Execution Workers | Synchronous | Track the status and performance of parallel query execution workers. |
| Query Hierarchical Data | Synchronous | Execute queries on data with hierarchical relationships. |
| Query Table Data | Synchronous | Search for and retrieve data from tables that matches specified conditions. |

## API Calling Patterns

### Authentication
SQL executed over a database connection (for example via `obclient`, JDBC, or ODBC) needs no separate API authentication; the connection's own credentials apply. For programmatic access through application drivers:

- Use standard database connection credentials (username/password)
- Connection strings typically follow: `mysql://user:password@host:port/database`
- No HTTP `Authorization` header is used for direct SQL execution
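
As a rough illustration of the connection-string form listed above, the sketch below splits such a URL into the fields a MySQL-compatible driver usually asks for. The helper name `parse_connection_url`, the sample host, the credentials, and the port are placeholders chosen for this example, not values taken from OceanBase documentation.

```python
from urllib.parse import urlsplit, unquote


def parse_connection_url(url: str) -> dict:
    """Split a mysql://user:password@host:port/database URL into its parts."""
    parts = urlsplit(url)
    if parts.scheme != "mysql":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),  # percent-encoded characters are decoded
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Placeholder values for illustration only (assumption, not a real endpoint).
params = parse_connection_url("mysql://app_user:s3cret@db.example.com:2881/test")
print(params["host"], params["port"], params["database"])
```

The resulting dictionary can be passed as keyword arguments to whichever driver is in use, after renaming keys to match that driver's parameter names.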

For REST-based interfaces (if available), use:
- `Authorization: Bearer <your_api_key>`
- Environment variable: `OCEANBASE_API_KEY`

However, most SQL operations are performed directly against the database engine without REST API intermediaries.

### Service Endpoint
OceanBase SQL is executed directly against the database server rather than through HTTP APIs. Common connection endpoints include:

- Direct TCP connection to OBServer nodes
- MySQL-compatible protocol endpoint: `mysql://<user>:<password>@<host>:<port>/<database>`
- Common regions for Alibaba Cloud OceanBase instances: cn-hangzhou, cn-shanghai, cn-beijing

For cloud deployments, use the connection string provided in the OceanBase console.

### Synchronous Execution Pattern
All SQL operations in OceanBase follow a synchronous execution model:

1. **Submit Query**: Send SQL statement via database client (e.g., `obclient`, JDBC, ODBC)
2. **Execute Immediately**: Database parses, optimizes, and executes the query
3. **Return Results**: Full result set is returned upon completion (no polling required)
4. **Handle Response**: Process rows, status messages, or error codes

This pattern applies to all query types including:
- Simple SELECT statements
- Aggregation (GROUP BY)
- Set operations (EXCEPT/MINUS)
- Hierarchical queries
- Dynamic SQL execution
- System view queries (gv$, v$)
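
The four steps above can be sketched with Python's DB-API. This is only an illustration of the blocking call pattern: `sqlite3` from the standard library stands in for an OceanBase driver connection (an assumption made so the snippet runs anywhere), and `run_query` is a hypothetical helper name.

```python
import sqlite3


def run_query(conn, sql, params=()):
    """Submit one statement and return all rows; database errors propagate as exceptions."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)  # steps 1-2: submit; the call blocks while the statement runs
        return cur.fetchall()     # step 3: the full result set is available, no polling
    finally:
        cur.close()


# sqlite3 is a stand-in for a real OceanBase connection (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ordr (o_w_id INT, o_d_id INT, o_id INT)")
conn.executemany("INSERT INTO ordr VALUES (?, ?, ?)", [(2, 5, 2100), (2, 5, 2101), (1, 3, 7)])

rows = run_query(
    conn,
    "SELECT o_id FROM ordr WHERE o_w_id = ? AND o_d_id = ? ORDER BY o_id",
    (2, 5),
)
print(rows)

# Step 4: a failure surfaces on the same call, not through a later status check.
try:
    run_query(conn, "SELECT * FROM missing_table")
    failed = False
except sqlite3.Error as exc:
    failed = True
    print("query failed:", exc)
```

The placeholder syntax is driver-specific: `sqlite3` uses `?`, while MySQL-protocol drivers for Python commonly use `%s`.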

## Parameter Reference

### Query Table Data

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| select_list | string | Yes | - | - | The list of columns to retrieve from the table. |
| table_list | string | Yes | - | - | The name(s) of the table(s) to query. |
| query_condition | string | No | - | - | A condition to filter rows, typically using comparison operators and logical expressions. |
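
To show how the three parameters map onto a statement, here is a minimal sketch that assembles them into `SELECT ... FROM ... WHERE ...`. The function name `build_select` is hypothetical, and the code performs text assembly only; it does not validate or escape its inputs.

```python
from typing import Optional


def build_select(select_list: str, table_list: str, query_condition: Optional[str] = None) -> str:
    """Assemble a SELECT statement from the parameters in the table above."""
    sql = f"SELECT {select_list} FROM {table_list}"
    if query_condition:  # query_condition is optional; omit WHERE when it is absent
        sql += f" WHERE {query_condition}"
    return sql


print(build_select("*", "ordr", "o_w_id = 2 AND o_d_id = 5"))
print(build_select("o_id, o_c_id", "ordr"))
```

Because the pieces are concatenated verbatim, values that come from user input should be passed as bound parameters through the driver rather than embedded in `query_condition`.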

### Query Hierarchical Data

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| CONNECT BY | clause | Yes | - | - | Specifies how to determine the parent-child relationship. An equality expression is usually used. |
| START WITH | clause | Yes | - | - | Specifies the root row in the hierarchical query. |
| PRIOR | operator | Yes | - | - | Unary operator indicating columns come from the parent row. |
| LEVEL | pseudocolumn | No | - | - | Node level (1 for root, 2 for children, etc.). |
| CONNECT_BY_ISLEAF | pseudocolumn | No | - | - | 1 if current row is a leaf node, 0 otherwise. |
| CONNECT_BY_ISCYCLE | pseudocolumn | No | - | - | 1 if current row is in a loop, 0 otherwise. |
| NOCYCLE | keyword | No | - | - | Allows query to return results even with loops. |
| ORDER SIBLINGS BY | clause | No | - | - | Sorts rows of the same level. |

### Perform Hierarchical Query

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| PRIOR | operator | No | - | - | Used in CONNECT BY conditions to reference the parent row's value. |
| CONNECT_BY_ROOT | operator | No | - | Cannot be used in START WITH or CONNECT BY conditions | Returns the value from the root row of the hierarchy. |

### Check If Subquery Returns Rows

| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| subquery | string | Yes | - | - | A subquery that returns rows to be checked for existence. |
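
The following sketch illustrates the EXISTS check with a correlated subquery. It runs against `sqlite3` as a stand-in engine (an assumption; OceanBase is not contacted), and the `departments` and `employees` tables with their rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INT, department_name TEXT);
    CREATE TABLE employees (employee_id INT, department_id INT);
    INSERT INTO departments VALUES (10, 'Admin'), (20, 'Sales'), (30, 'Purchasing');
    INSERT INTO employees VALUES (100, 10), (101, 30), (102, 30);
""")

# Keep only departments for which the correlated subquery returns at least one row.
staffed = conn.execute("""
    SELECT d.department_name
    FROM departments d
    WHERE EXISTS (SELECT 1 FROM employees e WHERE e.department_id = d.department_id)
    ORDER BY d.department_name
""").fetchall()
print(staffed)
```

EXISTS only tests whether the subquery yields any row, so the subquery's select list (`SELECT 1` here) does not affect the outcome.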

## Code Examples

### Basic Table Query - SQL - All Regions

```sql
obclient> SELECT * FROM ordr WHERE o_w_id = 2 AND o_d_id = 5;
```

### Group By with Aggregation - SQL - All Regions

```sql
obclient>CREATE TABLE t1(c1 INT, c2 INT);
Query OK, 0 rows affected (0.12 sec)

obclient>INSERT INTO t1 VALUES(1, 1);
Query OK, 1 rows affected (0.12 sec)

obclient>INSERT INTO t1 VALUES(2, 2);
Query OK, 1 rows affected (0.12 sec)

obclient>INSERT INTO t1 VALUES(3, 3);
Query OK, 1 rows affected (0.12 sec)

obclient>EXPLAIN SELECT SUM(c2) FROM t1 GROUP BY c1 HAVING SUM(c2) > 2\G
*************************** 1. row ***************************
Query Plan:
======================================
|ID|OPERATOR     |NAME|EST. ROWS|COST|
--------------------------------------
|0 |HASH GROUP BY|    |1        |40  |
|1 | TABLE SCAN  |T1  |3        |37  |
======================================

Outputs & filters: 
-------------------------------------
  0 - output([T_FUN_SUM(T1.C2)]), filter([T_FUN_SUM(T1.C2) > 2]), 
      group([T1.C1]), agg_func([T_FUN_SUM(T1.C2)])
  1 - output([T1.C1], [T1.C2]), filter(nil), 
      access([T1.C1], [T1.C2]), partitions(p0)
```
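
The EXPLAIN output above shows the plan but not the result. As a sketch of what the query returns for the three inserted rows, the snippet below runs the same statement on `sqlite3`, used here purely as a stand-in engine (an assumption); GROUP BY and HAVING follow standard SQL semantics in this case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (c1 INT, c2 INT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])

# Each c1 value forms its own group; HAVING then filters groups on the aggregate.
groups = conn.execute(
    "SELECT SUM(c2) FROM t1 GROUP BY c1 HAVING SUM(c2) > 2"
).fetchall()
print(groups)
```

Only the group with `c1 = 3` has a sum above 2, which matches the plan's estimate of one output row for the HASH GROUP BY operator.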

### Hierarchical Query Example - SQL - All Regions

```sql
CREATE TABLE emp(emp_id INT,position VARCHAR(50),mgr_id INT);
INSERT INTO emp VALUES (1,'Global Manager',NULL);
INSERT INTO emp VALUES (2,'Europe Regional Manager',1);
INSERT INTO emp VALUES (3,'Asia Pacific Regional Manager',1);
INSERT INTO emp VALUES (4,'Americas Regional Manager',1);
INSERT INTO emp VALUES (5,'Italy Regional Manager',2);
INSERT INTO emp VALUES (6,'France Regional Manager',2);
INSERT INTO emp VALUES (7,'China Regional Manager',3);
INSERT INTO emp VALUES (8,'Korea Regional Manager',3);
INSERT INTO emp VALUES (9,'Japan Regional Manager',3);
INSERT INTO emp VALUES (10,'US Regional Manager',4);
INSERT INTO emp VALUES (11,'Canada Regional Manager',4);
INSERT INTO emp VALUES (12,'Beijing Regional Manager',7);

SELECT emp_id, mgr_id, position, level FROM emp
START WITH mgr_id IS NULL CONNECT BY PRIOR emp_id = mgr_id;
```
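
To make the `LEVEL` pseudocolumn concrete, the sketch below walks the same `emp` rows in plain Python. It only illustrates how `START WITH mgr_id IS NULL` and `CONNECT BY PRIOR emp_id = mgr_id` relate rows to each other; it is not how OceanBase executes the query. The function name `walk` is hypothetical, and the sibling order shown is just the insertion order.

```python
from collections import defaultdict

# (emp_id, position, mgr_id) rows copied from the INSERT statements above.
EMP = [
    (1, "Global Manager", None),
    (2, "Europe Regional Manager", 1),
    (3, "Asia Pacific Regional Manager", 1),
    (4, "Americas Regional Manager", 1),
    (5, "Italy Regional Manager", 2),
    (6, "France Regional Manager", 2),
    (7, "China Regional Manager", 3),
    (8, "Korea Regional Manager", 3),
    (9, "Japan Regional Manager", 3),
    (10, "US Regional Manager", 4),
    (11, "Canada Regional Manager", 4),
    (12, "Beijing Regional Manager", 7),
]


def walk(rows):
    """Yield (emp_id, mgr_id, position, level) with each parent before its children."""
    children = defaultdict(list)
    for emp_id, position, mgr_id in rows:
        children[mgr_id].append((emp_id, position))
    # START WITH mgr_id IS NULL: root rows are those filed under the None key.
    stack = [(emp_id, None, position, 1) for emp_id, position in reversed(children[None])]
    while stack:
        emp_id, mgr_id, position, level = stack.pop()
        yield emp_id, mgr_id, position, level
        # CONNECT BY PRIOR emp_id = mgr_id: children are rows whose mgr_id equals this emp_id.
        for child_id, child_pos in reversed(children[emp_id]):
            stack.append((child_id, emp_id, child_pos, level + 1))


result = list(walk(EMP))
for row in result:
    print(row)
```

This sketch does not detect cycles, so data with a loop would make it run forever. In an actual query, sibling order is not guaranteed unless `ORDER SIBLINGS BY` is specified.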

### Set Difference Operation - SQL - All Regions

```sql
obclient>CREATE TABLE t1(c1 INT PRIMARY KEY, c2 INT);
Query OK, 0 rows affected (0.12 sec)

obclient>INSERT INTO t1 VALUES(1,1);
Query OK, 1 rows affected (0.12 sec)

obclient>INSERT INTO t1 VALUES(2,2);
Query OK, 1 rows affected (0.12 sec)

obclient>EXPLAIN SELECT c1 FROM t1 MINUS SELECT c2 FROM t1\G
*************************** 1. row ***************************
Query Plan:
==============================================
|ID|OPERATOR             |NAME|EST. ROWS|COST|
----------------------------------------------
|0 |MERGE EXCEPT DISTINCT|    |2        |77  |
|1 | TABLE SCAN          |T1  |2        |37  |
|2 | SORT                |    |2        |39  |
|3 |  TABLE SCAN         |T1  |2        |37  |
==============================================
Outputs & filters: 
-------------------------------------
  0 - output([MINUS(T1.C1, T1.C2)]), filter(nil)
  1 - output([T1.C1]), filter(nil), 
      access([T1.C1]), partitions(p0)
  2 - output([T1.C2]), filter(nil), sort_keys([T1.C2, ASC])
  3 - output([T1.C2]), filter(nil), 
      access([T1.C2]), partitions(p0)
```
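
With the two rows inserted above, `c1` and `c2` hold the same values, so the difference is empty. The sketch below uses different, made-up rows to show a non-empty result and the duplicate removal implied by the DISTINCT in the operator name. It runs on `sqlite3` as a stand-in engine (an assumption), which accepts the `EXCEPT` spelling but not `MINUS`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (c1 INT, c2 INT)")
# Illustrative rows: c1 holds 1, 1, 2, 3 and c2 holds 2, 5, 6, 7.
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, 2), (1, 5), (2, 6), (3, 7)])

# Values of c1 that never occur in c2; the set operation also removes duplicates.
diff = conn.execute(
    "SELECT c1 FROM t1 EXCEPT SELECT c2 FROM t1 ORDER BY 1"
).fetchall()
print(diff)
```

The value 2 is dropped because it appears in `c2`, and the value 1 appears once in the result although it occurs twice in `c1`.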

### Parallel Query with Exchange Operator - SQL - All Regions

```sql
obclient>CREATE TABLE t (c1 INT, c2 INT) PARTITION BY HASH(c1) PARTITIONS 5;
Query OK, 0 rows affected (0.12 sec)

obclient>EXPLAIN SELECT * FROM t\G
*************************** 1. row ***************************
Query Plan:
==============================================
|ID|OPERATOR           |NAME|EST. ROWS|COST  |
----------------------------------------------
|0 |EXCHANGE IN DISTR  |    |500000   |545109|
|1 | EXCHANGE OUT DISTR|    |500000   |320292|
|2 |  TABLE SCAN       |T   |500000   |320292|
==============================================

Outputs & filters:
-------------------------------------
  0 - output([T.C1], [T.C2]), filter(nil)
  1 - output([T.C1], [T.C2]), filter(nil)
  2 - output([T.C1], [T.C2]), filter(nil),
      access([T.C1], [T.C2]), partitions(p[0-4])
```

### NULL Value Testing - SQL - All Regions

```sql
SELECT last_name FROM employees WHERE commission_pct IS NULL
ORDER BY last_name;
```

### Membership Test with IN Condition - SQL - All Regions

```sql
SELECT * FROM employees WHERE job_id IN ('PU_CLERK', 'SH_CLERK') ORDER BY employee_id;

SELECT * FROM employees WHERE salary IN (SELECT salary FROM employees
WHERE department_id = 30) ORDER BY employee_id;
```

## Response Format

```json
{
  "rows": [
    {
      "o_w_id": 2,
      "o_d_id": 5,
      "o_id": 2100,
      "o_c_id": 2100,
      "o_carrier_id": 5,
      "o_ol_cnt": 12,
      "o_all_local": 1,
      "o_entry_d": "2020-02-15"
    },
    {
      "o_w_id": 2,
      "o_d_id": 5,
      "o_id": 2101,
      "o_c_id": 4,
      "o_carrier_id": null,
      "o_ol_cnt": 11,
      "o_all_local": 1,
      "o_entry_d": "2020-02-15"
    }
  ],
  "row_count": 2
}
```

**Key Fields**:
- `rows` — Array of result records with column values
- `row_count` — Total number of rows returned
- `o_w_id`, `o_d_id`, `o_id`, `o_c_id`, `o_carrier_id`, `o_ol_cnt`, `o_all_local`, `o_entry_d` — Example column fields from the ordr table

## Error Handling

| Error Code | Description | Recommended Action |
|-------------------|---------------------------|----------------------------------------|
| 400 | Invalid SQL syntax. Check the SELECT and WHERE clauses for correct formatting. | Review SQL syntax, ensure proper quoting and operator usage. |
| 404 | Table not found. Verify the table name is correct and exists in the database. | Confirm table exists and check for typos in table name. |
| 500 | Internal server error. Retry the request or contact support if the issue persists. | Retry the query; if persistent, check database logs or contact support. |
| 422 | Invalid subquery syntax or malformed SQL statement. | Validate subquery structure and ensure it returns compatible data types. |
| ORA-01436 | Loop detected in hierarchical query. ApsaraDB for OceanBase returns this error when non-equality operators in CONNECT BY clauses cause circular references. | Use equality operators in CONNECT BY clauses or add NOCYCLE keyword. |

## FAQ

Q: How do I execute dynamic SQL in OceanBase?
A: Use the DBMS_SQL package: open a cursor with OPEN_CURSOR, then call PARSE, BIND_VARIABLE, EXECUTE, and FETCH_ROWS to construct and run the SQL statement at runtime.

Q: What's the difference between EXCEPT and MINUS in OceanBase?
A: EXCEPT and MINUS are functionally equivalent in OceanBase and both perform set difference operations, removing rows from the first result set that appear in the second result set.

Q: How can I optimize GROUP BY queries in OceanBase?
A: OceanBase supports three GROUP BY implementations: SCALAR GROUP BY (for no grouping columns), HASH GROUP BY (hash-based aggregation), and MERGE GROUP BY (sort-based aggregation). Use query hints like /*+USE_HASH_AGGREGATION*/ or /*+NO_USE_HASH_AGGREGATION*/ to control the execution method.

Q: How do I handle NULL values in WHERE conditions?
A: Use IS NULL or IS NOT NULL operators to test for NULL values, as standard comparison operators (=, !=, etc.) return UNKNOWN when comparing with NULL.
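
The difference can be seen in a small sketch. It uses `sqlite3` as a stand-in engine (an assumption; the three-valued logic shown is standard SQL), with a made-up `employees` table: the `= NULL` predicate evaluates to UNKNOWN for every row and so matches nothing, while `IS NULL` finds the rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT, commission_pct REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Abel", 0.3), ("Baer", None), ("Chen", None)],
)

# Comparison with NULL is UNKNOWN, so the WHERE clause rejects every row.
eq_null = conn.execute(
    "SELECT last_name FROM employees WHERE commission_pct = NULL"
).fetchall()

# IS NULL is the dedicated test and returns the rows with no commission.
is_null = conn.execute(
    "SELECT last_name FROM employees WHERE commission_pct IS NULL ORDER BY last_name"
).fetchall()

print(eq_null)
print(is_null)
```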

Q: What system views can I use to monitor parallel query execution?
A: Query gv$tenant_px_worker_stat or v$tenant_px_worker_stat to monitor PX worker thread statistics, including session ID, tenant ID, server information, and execution context details.

## Pricing & Billing

### Billing Model
Per-request billing model where each SQL query execution is charged based on input and output operations.

### Price Reference

| Tier/Specification | Input Price | Output Price |
|--------------------|-------------|--------------|
| standard | 0.0001 / | 0.0001 / |
| default | 0.0001 / | 0.0002 / |

### Free Tier
Monthly free quota of 1000 queries.

### Usage Limits
Maximum of 10 queries per second.

### Billing Notes
Query costs are calculated based on actual execution count, with each query including both input and output components in the billing calculation.