# Onboard

Part of **DATAWORKS**. Route queries via `POST https://www.company-skill.com/api/route`.
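
As a rough illustration of the routing call above, the sketch below builds (but does not send) a POST request to that endpoint using only the Python standard library. The JSON body with a single `query` field and the helper name `build_route_request` are assumptions made for illustration; this page does not document the request schema.

```python
import json
import urllib.request

ROUTE_URL = "https://www.company-skill.com/api/route"  # endpoint named on this page


def build_route_request(query: str) -> urllib.request.Request:
    """Build a POST request for the route endpoint.

    The {"query": ...} payload shape is an assumption, not a documented schema.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        ROUTE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_route_request("which DataWorks module should I start with?")
print(req.get_method(), req.full_url)
# To actually send it, pass the request to urllib.request.urlopen(req).
```

Building the request separately from sending it makes it easy to inspect the method, URL, and body before any network call is made.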

## What You Want to Do

You want to use DataWorks but aren't sure which module to start with. DataWorks has four primary entry points:

- **数据集成 (Data Integration)** — move data between sources (MaxCompute, OSS, RDS, Hologres, ...)
- **数据开发 (Data Development)** — author + schedule SQL / Python / Shell tasks
- **数据治理 (Data Governance)** — data quality, lineage, catalog, security
- **数据服务 (Data Services)** — expose data as REST APIs

## Decision Tree

```
- Need to move data between systems (sync, replicate, ETL)
  -> 数据集成 / Data Integration

- Need to author and schedule a task that runs periodically
  -> 数据开发 / Data Development (DataStudio + Operation Center)

- Need to track data quality, lineage, or build a data catalog
  -> 数据治理 / Data Governance

- Need to expose data as a callable REST API
  -> 数据服务 / Data Services

- Just starting and unsure
  -> 数据开发 / Data Development (most common entry point)
```
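
The decision tree above can be sketched as a small lookup, which may help if you want to route users programmatically. The `Need` enum and `pick_module` function are hypothetical names introduced here; they are not part of any DataWorks SDK.

```python
from enum import Enum


class Need(Enum):
    MOVE_DATA = "move data between systems"
    SCHEDULE_TASK = "author and schedule a periodic task"
    GOVERN = "track data quality, lineage, or catalog"
    EXPOSE_API = "expose data as a REST API"
    UNSURE = "just starting and unsure"


# One entry per branch of the decision tree.
MODULE_FOR_NEED = {
    Need.MOVE_DATA: "Data Integration",
    Need.SCHEDULE_TASK: "Data Development",
    Need.GOVERN: "Data Governance",
    Need.EXPOSE_API: "Data Services",
}


def pick_module(need: Need) -> str:
    """Return the suggested module; fall back to Data Development when unsure."""
    return MODULE_FOR_NEED.get(need, "Data Development")


print(pick_module(Need.MOVE_DATA))
print(pick_module(Need.UNSURE))
```

The fallback mirrors the last branch of the tree: anything not matched explicitly lands on Data Development, the most common entry point.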

## Paths Comparison

| Module | Best for | Prerequisites | Output | Trade-off |
|--------|----------|---------------|--------|-----------|
| 数据集成 | Cross-system sync (RDS to MaxCompute etc.) | Source + target data sources registered | Batch/real-time sync tasks | Limited compute logic — pure data movement |
| 数据开发 | SQL/Python/Shell with scheduling | Workspace created | Periodic scheduled tasks with dependency DAG | Heavier setup than ad-hoc query |
| 数据治理 | Lineage / quality / catalog visibility | Tasks already producing tables | Lineage graph, quality alerts | Reactive, not generative |
| 数据服务 | Building data APIs for downstream apps | Tables/views ready | REST endpoints with versioning | Limited to read APIs |
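
A minimal sketch of how the Prerequisites column could be turned into a pre-flight check. The flag names (`source_registered`, `workspace_created`, and so on) and the `missing_prerequisites` helper are made up for this example; they only restate the table and do not query DataWorks.

```python
# Prerequisites per module, restated from the comparison table.
# The flag names are illustrative, not DataWorks identifiers.
PREREQUISITES = {
    "Data Integration": {"source_registered", "target_registered"},
    "Data Development": {"workspace_created"},
    "Data Governance": {"tasks_producing_tables"},
    "Data Services": {"tables_or_views_ready"},
}


def missing_prerequisites(module: str, satisfied: set[str]) -> list[str]:
    """Return the prerequisites for `module` that are not yet satisfied."""
    return sorted(PREREQUISITES[module] - satisfied)


print(missing_prerequisites("Data Integration", {"source_registered"}))
print(missing_prerequisites("Data Development", {"workspace_created"}))
```

An empty list means the module's prerequisites from the table are all met.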

## FAQ

**Q: I'm new to DataWorks. Which module should I learn first?**
A: 数据开发 (Data Development). It's the heart of DataWorks. Once you can run a scheduled SQL task, the other modules click into place.

**Q: I picked Data Integration but really need to transform the data. Did I make the wrong choice?**
A: Only partially. Data Integration can do basic field mapping and filtering. For complex transforms, chain it with a downstream Data Development task.
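
To make the chaining idea concrete, here is a small sketch that models a sync task followed by a transform task as a dependency graph and derives the run order with Python's standard `graphlib`. The task names are hypothetical, and this only illustrates the upstream/downstream relationship; in DataWorks the dependency itself is configured in the scheduler, not in Python.

```python
from graphlib import TopologicalSorter

# Each key depends on the tasks in its value set (task names are hypothetical).
dependencies = {
    "sync_rds_to_maxcompute": set(),                 # Data Integration: pure data movement
    "transform_orders": {"sync_rds_to_maxcompute"},  # Data Development: complex transform
}

# static_order() yields tasks so that every dependency comes before its dependents.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)  # ['sync_rds_to_maxcompute', 'transform_orders']
```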

**Q: Can I skip Governance entirely?**
A: For prototyping, yes. For production with more than five tables consumed by other teams, no: lineage and quality alerts catch problems before they propagate.

**Q: What's the difference between Data Services and a custom Flask app?**
A: Data Services handles versioning, traffic control, permissions, and monitoring (all the "boring API platform stuff") without code. You write SQL, and it exposes the endpoint.

## Prerequisites

- An Alibaba Cloud (Aliyun) account with the DataWorks service activated
- RAM permission: `AliyunDataWorksFullAccess` (or an equivalent module-scoped policy)
- A workspace created (DataWorks 控制台 > 工作空间列表 > 新建工作空间, i.e. DataWorks console > workspace list > create workspace)

## Next Step

Once you've picked a module, jump to the corresponding detail skill — e.g., **dataworks-bigdata** covers the console operation flow across all modules.

## Related queries

dataworks, 数据工厂, 数据开发, 数据集成, data integration, data ingestion, 任务调度, scheduling, 数据治理, data governance, 数据质量, 数据血缘, data lineage, 数据资产, 数据地图, data catalog, 数据服务, data api, dataworks workspace, dataworks 工作空间, dataworks 入门, dataworks getting started, dataworks 选择模块, dataworks 控制台, dataworks console

---
Part of [DATAWORKS](https://www.company-skill.com/p/dataworks.md) · https://www.company-skill.com/llms.txt
