# rds-ml

# ApsaraDB RDS Advanced Database Features Console Guide

## Operations Overview

| Operation | Console Navigation Path | Prerequisites | Description |
|----------|------------------------|---------------|-------------|
| Train Machine Learning Model | Console > RDS > Instances > Create Instance | ECS instance in same region/VPC; basic SQL/Linux knowledge; SSH access to RDS Custom instance | Create an RDS Custom instance with SQLFlow image and train DNNClassifier models via SQL |
| Enable MADlib Plugin | Console > ApsaraDB RDS > Instances > Select Instance > Database Management > Extensions | RDS for PostgreSQL 11 or 12; privileged account | Activate the MADlib in-database machine learning library via SQL |
| Install TimescaleDB Extension | Instances > Select instance > Plugin Settings > Plugins tab > TimescaleDB extension > Install | PostgreSQL 10+; minor engine version ≥20230330 (≥20241030 for PostgreSQL 17); instance in Running status | Install TimescaleDB for time-series data with hypertables and automatic sharding |
| Enable Molecular Search | Console > RDS > Instances > Select Instance > Databases > SQL Console | RDS for PostgreSQL 12 | Enable RDKit extension for molecular storage, similarity search, and substructure matching |
| Implement Retrieval-Augmented Generation (RAG) | Instances > Select region > Instance ID > Plugins > Install rds_ai extension | Privileged PostgreSQL account; Alibaba Cloud Model Studio API key; external network access | Deploy RAG applications using rds_ai extension with SQL-only workflows |
| Download and Install PPAS Driver | None (direct download) | None | Obtain PPAS development drivers for Java, OCI, ODBC, and .NET platforms |

## Step-by-Step Instructions

### Train Machine Learning Model

**Navigation**: Console > RDS > Instances > Create Instance

**Prerequisites**:
- An ECS instance in the same region and VPC as the RDS Custom instance
- Basic knowledge of SQL and Linux command line
- Access to the RDS Custom instance via SSH

1. Click **Create Instance** (button) — top-right corner
   - Notes: This starts the RDS instance creation wizard

2. Set the **Instance Class** (dropdown) — instance configuration section
   - Notes: Choose a GPU architecture instance type

3. Select **Custom Image** (dropdown) — custom image selection list
   - Notes: Choose "SQLFlow" from the available images

4. After creation, go to the instance details panel and click **SSH Connection** (link)
   - Notes: Use the private IP address of the RDS Custom instance to connect

5. In the terminal, run the **start_sqlflow.sh** script (text_input) — command line interface
   - Notes: Command example: `./start_sqlflow.sh -P 50054 -u aliyun_rds -p aliyun_rds`. This starts MySQL and the RDS SQLFlow container.

6. On your ECS instance, download the **sqlflow** client (file) — ECS instance file system
   - Notes: Use `wget` or `scp` from the RDS instance

7. Grant executable permissions using **chmod +x sqlflow** (text_input) — command line interface

8. Create a **ca.crt** file (file) — ECS instance file system
   - Notes: Paste the TLS certificate content printed by `start_sqlflow.sh` in step 5

9. Connect to SQLFlow service using the **sqlflow** client (text_input) — command line interface
   - Notes: Use parameters: `-c ca.crt`, `-d 'mysql://user:password@tcp(host:port)/db?maxAllowedPacket=0'`, `-s sqlflowHost:sqlflowPort`

10. In SQLFlow CLI, create the iris database using **CREATE DATABASE** (text_input)
    - Notes: Define tables and insert sample data via SQL

11. Train the model using **TO TRAIN DNNClassifier** (text_input) — SQLFlow CLI
    - Notes: Specify `model.n_classes`, `model.hidden_units`, `optimizer.learning_rate`, `train.epoch`, feature columns, label column, and output table

12. Evaluate the model using **TO EVALUATE** (text_input) — SQLFlow CLI
    - Notes: Save results to a table like `sqlflow_models.evaluate_result_table`

13. Make predictions using **TO PREDICT** (text_input) — SQLFlow CLI
    - Notes: The target is written as `table.column`; for example, `iris.predict.class` writes the predicted `class` column to the `iris.predict` table
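
Steps 9 and 11 can be sketched as a small Python helper that assembles the client command line and a training statement. This is a minimal sketch, not tooling shipped with the image: the host `192.168.0.10`, MySQL port `3306`, the `iris.train` table, the feature column names, the model name `sqlflow_models.my_dnn_model`, and the hyperparameter values are all assumptions chosen for illustration. Only the `-c`, `-d`, and `-s` flags, the credentials and port from step 5, and the attribute names come from the steps above.

```python
import shlex


def sqlflow_command(ca_cert, datasource, server):
    """Build the argv list for the sqlflow client using the flags from step 9."""
    return ["./sqlflow", "-c", ca_cert, "-d", datasource, "-s", server]


def train_statement(table, features, label, model,
                    n_classes, hidden_units, learning_rate, epochs):
    """Compose a TO TRAIN DNNClassifier statement from the step 11 settings."""
    attributes = ", ".join([
        f"model.n_classes = {n_classes}",
        f"model.hidden_units = {list(hidden_units)}",
        f"optimizer.learning_rate = {learning_rate}",
        f"train.epoch = {epochs}",
    ])
    return "\n".join([
        f"SELECT * FROM {table}",
        "TO TRAIN DNNClassifier",
        f"WITH {attributes}",
        f"COLUMN {', '.join(features)}",
        f"LABEL {label}",
        f"INTO {model};",
    ])


# Hypothetical connection details; replace them with your own values.
# No default database is named because iris is only created in step 10.
argv = sqlflow_command(
    "ca.crt",
    "mysql://aliyun_rds:aliyun_rds@tcp(192.168.0.10:3306)/?maxAllowedPacket=0",
    "192.168.0.10:50054",
)
print(shlex.join(argv))  # shell-quoted command line to run on the ECS instance

# Hypothetical table, columns, and hyperparameters for the iris example.
print(train_statement(
    table="iris.train",
    features=["sepal_length", "sepal_width", "petal_length", "petal_width"],
    label="class",
    model="sqlflow_models.my_dnn_model",
    n_classes=3,
    hidden_units=[10, 20],
    learning_rate=0.01,
    epochs=10,
))
```

The printed statement can be pasted into the SQLFlow CLI once the `iris` tables from step 10 exist.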

### Enable MADlib Plugin

**Navigation**: Console > ApsaraDB RDS > Instances > Select Instance > Database Management > Extensions

**Prerequisites**:
- ApsaraDB RDS for PostgreSQL instance with engine version PostgreSQL 11 or PostgreSQL 12
- A privileged account to connect to the RDS instance

1. In the left navigation panel, click the **Extensions** (tab)
   - Notes: This opens the list of available and installed extensions

2. Locate **MADlib** (checkbox) — Extensions list
   - Notes: The console shows MADlib status only; if MADlib is not enabled, enable it by running the SQL commands in the next step

3. Connect to the database via SQL Console and run:
   ```sql
   -- MADlib depends on PL/Python, so create plpythonu first
   CREATE EXTENSION plpythonu;
   CREATE EXTENSION madlib;
   ```
   - Element: **Run** (button) — top-right of SQL editor

### Install TimescaleDB Extension

**Navigation**: Instances > Select instance > Plugin Settings > Plugins tab > TimescaleDB extension > Install

**Prerequisites**:
- Major version of the instance is PostgreSQL 10 or later
- Minor engine version is 20230330 or later (20241030 for PostgreSQL 17)
- Instance must be in Running status before installing extension
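
The prerequisites above amount to a simple check, sketched here in Python. The function name and arguments are hypothetical and are not part of any RDS API. The sketch also assumes the 20241030 minimum applies only to PostgreSQL 17, since this guide states nothing about later major versions.

```python
def timescaledb_supported(major_version, minor_engine_version, status):
    """Return True when an instance meets the TimescaleDB prerequisites listed above."""
    # PostgreSQL 17 needs a newer minor engine version than earlier majors.
    required_minor = 20241030 if major_version == 17 else 20230330
    return (
        major_version >= 10
        and minor_engine_version >= required_minor
        and status == "Running"
    )


# Hypothetical instances: (major version, minor engine version, status).
print(timescaledb_supported(14, 20230330, "Running"))          # True
print(timescaledb_supported(17, 20230330, "Running"))          # False: minor version too old for 17
print(timescaledb_supported(14, 20230830, "INS_MAINTAINING"))  # False: instance is not Running
```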

1. Click **Instances** (link) — top navigation bar
   - Notes: Select the appropriate region first

2. Click the **Instance ID** (link) — main content area
   - Notes: Opens the instance details page

3. In the left-side navigation pane, click **Plugin Settings** (menu)

4. On the **Plugins** (tab), find **TimescaleDB extension** (link) and click it
   - Notes: A dialog box appears

5. In the dialog box, select values for **Target database** (dropdown) and **Privileged account** (dropdown), then click **Install** (button)
   - Notes: Wait for instance status to change from INS_MAINTAINING to Running (refresh page if needed)

| Parameter | Type | Required | Options/Values | Description |
|-----------|------|----------|----------------|-------------|
| Target database | dropdown | Yes | — | The database in which the TimescaleDB extension will be installed |
| Privileged account | dropdown | Yes | — | A database user with sufficient privileges to install extensions |

### Enable Molecular Search

**Navigation**: Console > RDS > Instances > Select Instance > Databases > SQL Console

**Prerequisites**:
- An RDS instance running PostgreSQL 12

1. Click **SQL Console** (link) — main content area
   - Notes: Ensure you have permissions to execute DDL commands

2. In the SQL editor, enter `CREATE EXTENSION rdkit;` and click **Run** (button) — top-right corner of the SQL editor
   - Notes: This enables cheminformatics functions like molecular similarity and substructure search

3. (Optional) Adjust similarity thresholds using **SET rdkit.tanimoto_threshold = value** (text_input) — SQL editor
   - Notes: Default threshold is 0.5; valid range typically 0.0–1.0
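
To make the threshold concrete, the following Python sketch computes Tanimoto similarity over two toy fingerprints, represented as sets of on-bit positions, and compares the score with the 0.5 default. The fingerprints are made up, and the inclusive comparison is an assumption about how the threshold is applied. RDKit itself derives real fingerprints from molecule structures inside the database.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity: shared on-bits divided by all on-bits in either fingerprint."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(fp_a & fp_b) / len(union)


def is_similar(fp_a, fp_b, threshold=0.5):
    """Return True when the score reaches the threshold (assumed inclusive)."""
    return tanimoto(fp_a, fp_b) >= threshold


# Made-up fingerprints: each set holds the positions of the bits that are on.
query = {1, 2, 3, 4}
close_match = {1, 2, 3, 5}   # 3 shared bits out of 5 distinct bits
weak_match = {3, 4, 5, 6}    # 2 shared bits out of 6 distinct bits

print(is_similar(query, close_match))  # True: 3/5 is above 0.5
print(is_similar(query, weak_match))   # False: 2/6 is below 0.5
```

Raising the threshold narrows the result set to closer matches; lowering it admits more distant ones.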

### Implement Retrieval-Augmented Generation (RAG)

**Navigation**: Instances > Select region > Instance ID > Plugins > Install rds_ai extension

**Prerequisites**:
- A privileged account for RDS for PostgreSQL has been created
- Alibaba Cloud Model Studio has been activated and an API key obtained
- Network is configured to allow external model access

1. Click **Instances** (link) — top navigation bar
   - Notes: First select the correct region

2. Click the **Instance ID** (link) — main content area

3. In the left-side navigation pane, click the **Plugins** (tab)

4. Find the **rds_ai** extension row and click **Install** (button)
   - Notes: A dialog appears prompting for database and account selection

5. Wait for **Instance Status** (label) — top of instance details panel — to change from INS_MAINTAINING to Running
   - Notes: Installation takes about one minute; refresh the page to verify

6. After installation, connect via SQL Console and configure the API key:
   ```sql
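   -- Store the Model Studio API key for every model listed in rds_ai.model_list
   SELECT rds_ai.update_model(model_name,'token','sk-****') FROM rds_ai.model_list;
   -- Raise the HTTP timeouts used for model calls
   SET http.timeout_msec TO 200000;
   SELECT http.http_set_curlopt('CURLOPT_TIMEOUT', '200000');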
   ```

7. Use SQL functions like `rds_ai.embed()`, `multi_retrieve()`, and `rag()` for end-to-end RAG workflows
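
Conceptually, a RAG query embeds the question, retrieves the closest stored passages, and passes them to the language model as context. The Python sketch below illustrates only that flow with toy two-dimensional vectors. It does not call or reproduce `rds_ai`, and the function names, vectors, and prompt wording are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, docs, k=2):
    """Return the text of the k passages whose vectors are closest to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Place the retrieved passages in front of the question for the model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n{context}\nQuestion: {question}"


# Toy corpus: (passage, made-up embedding). Real embeddings have far more dimensions.
docs = [
    ("TimescaleDB stores time-series data in hypertables.", [1.0, 0.0]),
    ("RDKit supports substructure matching.", [0.0, 1.0]),
    ("Hypertables are partitioned automatically.", [0.9, 0.1]),
]
question_vec = [1.0, 0.0]  # pretend embedding of the question

passages = retrieve(question_vec, docs, k=2)
print(build_prompt("How does TimescaleDB store data?", passages))
```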

### Download and Install PPAS Driver

**Navigation**: — (No console steps; direct download)

**Prerequisites**: None

1. Visit the official PPAS driver download page (external link)
   - Notes: Drivers are provided for Linux and Windows

2. Download the appropriate driver package:
   - **Java driver** (link)
   - **OCI driver** (link)
   - **ODBC driver** (link)
   - **.NET driver** (link)

3. Install the driver following platform-specific instructions
   - Notes: No console interaction required; this is a client-side setup

## FAQ

Q: Where do I find the SQLFlow custom image when creating an RDS instance?
A: During instance creation, in the "Custom Image" dropdown under the instance configuration section, select "SQLFlow" from the list of available managed AI images.

Q: Can I enable MADlib directly from the console without SQL?
A: No. While the Extensions tab shows MADlib status, activation requires executing `CREATE EXTENSION madlib;` via SQL Console after enabling `plpythonu`.

Q: What happens if I try to install TimescaleDB on an unsupported PostgreSQL version?
A: The **Install** button will be disabled or grayed out. Ensure your instance runs PostgreSQL 10+ with engine version ≥20230330.

Q: Do I need to restart my RDS instance after installing rds_ai or RDKit?
A: No. The installation process handles internal reloading. However, the instance temporarily enters INS_MAINTAINING status and returns to Running automatically.

Q: Are PPAS drivers available through the RDS console?
A: No. PPAS drivers must be downloaded separately from documentation-provided links and installed on your client machine.

## Pricing & Billing

### Billing Model
- TimescaleDB: billed per instance hour
- RAG (rds_ai): billed per request (per token for embedding and LLM calls)
- MADlib, RDKit, SQLFlow core: free to use (standard RDS instance charges apply)

### Price Reference

| Model | Input Price | Output Price |
|------|-------------|--------------|
| text-embedding-v3 | 0.0001 /tokens | 0.0002 /tokens |
| qwen-plus | 0.002 /tokens | 0.004 /tokens |
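
As a rough worked example of the per-token billing described above, the Python sketch below multiplies token counts by the listed qwen-plus prices. It assumes the listed figures are prices per single token in the account's billing currency, which the table does not state, and it ignores the free tier. The token counts are made up.

```python
from decimal import Decimal

# Prices copied from the table above; the per-token unit is an assumption.
PRICES = {
    "text-embedding-v3": (Decimal("0.0001"), Decimal("0.0002")),
    "qwen-plus": (Decimal("0.002"), Decimal("0.004")),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return input_tokens * input price + output_tokens * output price."""
    input_price, output_price = PRICES[model]
    return input_price * input_tokens + output_price * output_tokens


# Made-up usage: 1,500 prompt tokens and 500 generated tokens on qwen-plus.
# 1500 * 0.002 + 500 * 0.004 = 3.000 + 2.000
print(estimate_cost("qwen-plus", 1500, 500))  # prints 5.000
```

`Decimal` is used instead of floats so that small per-token prices add up without rounding drift.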

### Free Tier
- 100 tokens

### Billing Notes
- API calls are billed per request; async tasks are billed on completion
- Standard RDS instance fees apply regardless of extension usage
- Free tier applies only to rds_ai model invocations, not underlying compute resources