# eb-monitoring

Part of **EB**

<!-- intent-backlink:auto -->

> 💡 **Path Selection**: This skill is one implementation path for [Monitor event streams and set up alerts](../../intent/eb-monitor-alerts/SKILL.md). If you're unsure which path to take, check the routing skill first.

# EventBridge Monitoring and Alerting

## Capabilities Overview

| Sub-capability | Calling Mode | Description |
|--------|----------|------|
| Publish Monitoring Events | Synchronous | Receive and process monitoring events from various Alibaba Cloud services. |
| ActionTrail Events | Synchronous | Receive audit log events from ActionTrail service. |
| Alibaba Cloud DNS PrivateZone Events | Synchronous | Receive DNS-related security and performance events. |
| Alibaba Cloud DNS Events | Synchronous | Monitor DNS security events and anomalies. |
| AnalyticDB for MySQL Events | Synchronous | Receive database monitoring events from AnalyticDB. |
| Anti-DDoS Events | Synchronous | Receive DDoS attack detection and mitigation events. |
| ApsaraDB RDS Events | Synchronous | Monitor relational database service events. |
| ApsaraDB Instance Events | Synchronous | Receive events for various database instances including MySQL and MongoDB. |
| ApsaraDB for Cassandra Events | Synchronous | Receive NoSQL database events from Cassandra instances. |
| ApsaraDB for HBase Events | Synchronous | Receive big data database events from HBase instances. |
| ApsaraVideo VOD Events | Synchronous | Receive video-on-demand processing events. |
| Auto Scaling Events | Synchronous | Receive compute scaling activity events. |
| Blockchain as a Service Events | Synchronous | Receive blockchain transaction and network events. |
| BatchCompute Events | Synchronous | Receive batch job execution status events. |
| CDN Events | Synchronous | Receive content delivery network performance and cache events. |
| Cloud Firewall Events | Synchronous | Receive network security and firewall policy events. |
| City Visual Intelligence Engine Events | Synchronous | Receive smart city analytics events. |
| CloudConfig Events | Synchronous | Receive configuration compliance and resource change events. |
| ACK Events | Synchronous | Receive Kubernetes cluster and pod events from ACK. |
| Receive E-MapReduce Event Notifications | Synchronous | Receive event notifications from E-MapReduce service. |
| Record Cloud Security Scanner Events | Synchronous | Record events from Cloud Security Scanner service. |
| Publish Web Application Firewall Events | Synchronous | Publish Web Application Firewall events to EventBridge. |
| Monitor Cloud Enterprise Network Events | Synchronous | Monitor Cloud Enterprise Network service events. |
| Monitor VPN Gateway Events | Synchronous | Monitor VPN Gateway service events. |
| Receive Resource Management Events | Synchronous | Receive events from Resource Management service. |
| Monitor RTC Events | Synchronous | Monitor Real-Time Communication service events. |
| Publish Video Edge Intelligence Events | Synchronous | Publish events from Video Edge Intelligence Service to EventBridge. |
| Receive Visual Intelligence Events | Synchronous | Receive events from Visual Intelligence API service. |
| Publish IMM Events | Synchronous | Publish Intelligent Media Management service events. |
| Monitor DTS Job Status | Synchronous | Monitor Data Transmission Service job status events. |
| Publish Data Disaster Recovery Events | Synchronous | Publish events from Data Disaster Recovery service. |
| Receive Direct Mail Events | Synchronous | Receive events from Direct Mail service. |
| Publish Container Registry Events | Synchronous | Publish Container Registry events to EventBridge. |
| Check Service Linked Role Status | Synchronous | Verify if the required service-linked role exists and is properly configured. |
| Create Service Linked Role | Synchronous | Create the service-linked role required for EventBridge operations. |
| Monitor E-HPC Cluster Events | Synchronous | Monitor Elastic High Performance Computing cluster events. |
| Monitor EBS Events | Synchronous | Monitor Elastic Block Storage events. |
| Monitor ECS Events | Synchronous | Monitor Elastic Compute Service instance events. |
| Publish Function Compute Events | Synchronous | Publish Function Compute service events. |
| Monitor PolarDB Events | Synchronous | Monitor PolarDB instance events. |
| Monitor PolarDB-X Events | Synchronous | Monitor cloud-native distributed database PolarDB-X resource changes. |
| Monitor ROS Stack Events | Synchronous | Monitor Resource Orchestration Service stack events. |
| Monitor Tair Events | Synchronous | Monitor Tair (Redis-compatible) instance events. |

## API Calling Patterns

### Authentication
This skill authenticates EventBridge API requests with a bearer token.

- Header format: `Authorization: Bearer <your_api_key>`
- Environment variable: `DASHSCOPE_API_KEY`
- Note: Receiving events needs no API key, because EventBridge pushes events to the targets you configure rather than your application calling an API. Management calls, such as creating rules or checking service-linked roles, require your Alibaba Cloud credentials.

### Service Endpoint
EventBridge APIs use region-specific endpoints with the following pattern:

`https://eventbridge.{region}.aliyuncs.com`

Common regions include:
- cn-hangzhou
- cn-shanghai
- cn-beijing

For global operations, some APIs may use `https://eventbridge.aliyuncs.com`.
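
The endpoint pattern above can be sketched as a small helper. The function name `eventbridge_endpoint` and the region-ID check are illustrative assumptions, not part of any SDK; the URLs are the ones stated in this section.

```python
import re

# Global endpoint as stated above; used when no region is supplied.
GLOBAL_ENDPOINT = "https://eventbridge.aliyuncs.com"


def eventbridge_endpoint(region=None):
    """Return the regional endpoint, or the global one when no region is given."""
    if region is None:
        return GLOBAL_ENDPOINT
    # Assumption: region IDs look like "cn-hangzhou" (lowercase parts joined by hyphens).
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)+", region):
        raise ValueError(f"unexpected region ID: {region!r}")
    return f"https://eventbridge.{region}.aliyuncs.com"


print(eventbridge_endpoint("cn-hangzhou"))
```

Rejecting malformed region IDs early keeps a typo from turning into a confusing DNS failure later.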

### Synchronous Event Processing
Most EventBridge monitoring and alerting APIs follow a synchronous calling pattern:

1. Events are automatically published by Alibaba Cloud services to EventBridge
2. Your application subscribes to these events via EventBridge rules
3. When an event occurs, EventBridge delivers it immediately to your target (HTTP endpoint, function, queue, etc.)
4. Your application processes the event payload in real-time

Key characteristics:
- No polling required - events are pushed immediately
- Standard CloudEvents 1.0 format for all events
- Includes metadata like `id`, `source`, `type`, `time`, and `data`
- Events contain service-specific details in the `data` field
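
As a rough illustration of step 4, a target application might route each pushed event by its `type` field. This is a minimal sketch: the `on_event` decorator, the `HANDLERS` registry, and the handler names are hypothetical, and the sample body only reuses field names from the examples in this document.

```python
import json

HANDLERS = {}  # maps an event-type prefix to a handler function


def on_event(prefix):
    """Register a handler for every event whose `type` starts with `prefix`."""
    def register(func):
        HANDLERS[prefix] = func
        return func
    return register


@on_event("ess:ScalingActivity:")
def handle_scaling(event):
    return f"scaling activity {event['data'].get('scalingActivityId')}"


@on_event("ecs:Disk:")
def handle_disk(event):
    return f"disk event for {event['data'].get('diskId')}"


def dispatch(raw_body):
    """Parse a pushed event body and route it by its `type` field."""
    event = json.loads(raw_body)
    # Try the longest prefix first so a more specific handler wins.
    for prefix in sorted(HANDLERS, key=len, reverse=True):
        if event["type"].startswith(prefix):
            return HANDLERS[prefix](event)
    return None  # no handler registered for this event type


body = json.dumps({
    "id": "example-id",
    "source": "acs.ess",
    "specversion": "1.0",
    "type": "ess:ScalingActivity:ScaleInError",
    "time": "2020-11-19T21:04:41Z",
    "data": {"scalingActivityId": "asa-xxx"},
})
print(dispatch(body))
```

Routing on a prefix rather than the full type lets one handler cover a whole event family, such as every `ess:ScalingActivity:*` event.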

## Parameter Reference

### Common Event Parameters

| Parameter | Type | Required | Default | Constraints | Description |
|------|------|------|--------|------|------|
| id | string | true | | | A unique identifier for the event |
| source | string | true | | | The source of the event, indicating the service that generated it |
| specversion | string | true | | | The version of the CloudEvents specification used |
| type | string | true | | | The type of the event, which determines the specific event category |
| time | string | true | | ISO 8601 format | The timestamp when the event was generated |
| data | object | true | | | The main payload containing event-specific data |
| datacontenttype | string | true | | | The content type of the data field, typically application/json |
| subject | string | false | | | The identifier of the resource that triggered the event, for example `acs:ecs:cn-hangzhou:123456789098****:disk/d-wz9ad6x3sistd7fh****` |
| aliyunaccountid | string | false | | | The Alibaba Cloud account ID associated with the event |
| aliyunregionid | string | false | | | The region where the event occurred |
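
A receiver could check incoming events against the table above before processing them. The sketch below is one way to do that; `validate_event` and the `REQUIRED` mapping are illustrative names, and the required list simply mirrors the table.

```python
# Required attributes and their expected Python types, mirroring the table above.
REQUIRED = {
    "id": str,
    "source": str,
    "specversion": str,
    "type": str,
    "time": str,
    "data": dict,
    "datacontenttype": str,
}


def validate_event(event):
    """Return a list of problems; an empty list means the event passed."""
    problems = []
    for name, expected in REQUIRED.items():
        if name not in event:
            problems.append(f"missing required field: {name}")
        elif not isinstance(event[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    return problems


sample = {
    "id": "example-id",
    "source": "acs.ecs",
    "specversion": "1.0",
    "type": "ecs:Disk:OverduePaymentRelease",
    "time": "2021-01-18T16:20:51.199Z",
    "data": {"diskId": "d-example"},
}
print(validate_event(sample))
```

Some sample payloads in this document omit `datacontenttype`, so a strict check like this one would flag them; relax the `REQUIRED` mapping if your sources behave the same way.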

### Service-Specific Parameters

#### ECS Events
| Parameter | Type | Required | Default | Constraints | Description |
|------|------|------|--------|------|------|
| type | string | true | | ecs:Disk:ConvertToPostpaidCompleted, ecs:Instance:BurstablePerformanceRestricted, etc. | The event type identifier for ECS operations |
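
Event types such as `ecs:Disk:ConvertToPostpaidCompleted` are colon-separated, so splitting them can simplify filtering. In this sketch the labels `service`, `resource`, and `action` are an interpretation of the three parts, not official field names, and `parse_event_type` is a hypothetical helper.

```python
from typing import NamedTuple


class EventType(NamedTuple):
    service: str   # e.g. "ecs"
    resource: str  # e.g. "Disk"
    action: str    # e.g. "ConvertToPostpaidCompleted"


def parse_event_type(type_string):
    """Split a `service:resource:action` event type into its three parts."""
    # maxsplit=2 keeps any further colons inside the last part.
    parts = type_string.split(":", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected event type: {type_string!r}")
    return EventType(*parts)


parsed = parse_event_type("ecs:Instance:BurstablePerformanceRestricted")
print(parsed.service, parsed.resource, parsed.action)
```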

#### Auto Scaling Events
| Parameter | Type | Required | Default | Constraints | Description |
|------|------|------|--------|------|------|
| type | string | true | | ess:ScalingActivity:ScaleInError, ess:LifecycleHook:ScaleIn, etc. | The type of scaling event |
| data.cause | string | false | | | The reason for the scaling activity |
| data.scalingActivityId | string | false | | | The unique ID of the scaling activity |
| data.requestId | string | false | | | The request ID associated with the scaling activity |

#### ACK Events
| Parameter | Type | Required | Default | Constraints | Description |
|------|------|------|--------|------|------|
| type | string | true | | cs:k8s:K8s-event-via-arms, cs:k8s:PodRelatedEvent | The Kubernetes event type |
| data.reason | string | true | | | The reason for the Kubernetes event |
| data.involvedObject.kind | string | true | | Pod, Node, Service, etc. | The type of Kubernetes object involved |
| data.type | string | false | | Warning, Normal | The event severity |
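
For alerting, the ACK fields above are often enough to pick out the events worth paging on. The following sketch filters for warnings that involve Pods; `is_pod_warning` is a hypothetical helper and the sample event is invented for illustration, using only the field names from the table.

```python
def is_pod_warning(event):
    """True for ACK events with severity Warning whose involved object is a Pod."""
    data = event.get("data", {})
    involved = data.get("involvedObject", {})
    return (
        event.get("type", "").startswith("cs:k8s:")
        and data.get("type") == "Warning"
        and involved.get("kind") == "Pod"
    )


# Invented sample payload; real events carry more fields.
sample = {
    "type": "cs:k8s:PodRelatedEvent",
    "data": {
        "reason": "BackOff",
        "type": "Warning",
        "involvedObject": {"kind": "Pod"},
    },
}
print(is_pod_warning(sample))
```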

#### Service Linked Role Management
| Parameter | Type | Required | Default | Constraints | Description |
|------|------|------|--------|------|------|
| ProductName | string | true/false | | | The name of the cloud service or service-linked role |

## Code Examples

### Auto Scaling Event Example - JSON - All Regions

```json
{
  "id": "45ef4dewdwe1-7c35-447a-bd93-fab****",
  "source": "acs.ess",
  "specversion": "1.0",
  "subject": "arn:acs:ess:cn-hangzhou:123456789xxxxxxx:scalinggroup/asg-xxxxx",
  "time": "2020-11-19T21:04:41Z",
  "type": "ess:ScalingActivity:ScaleInError",
  "aliyunaccountid": "123456789098****",
  "aliyunpublishtime": "2020-11-19T21:04:42Z",
  "aliyuneventbusname": "default",
  "aliyunregionid": "cn-hangzhou",
  "aliyunpublishaddr": "172.25.XX.XX",
  "data": {
    "cause": "A user changed the Desired Capacity, changing the Total Capacity from \"1\" to \"0\".",
    "description": "Fail to remove Instances \"i-xxx\" (code:\"LifecycleActionResult.Abandon\", msg:\"Abandon lifecycleActionResult parameter caused the instance to rollback.\").",
    "startTime": "2022-09-06T06:29:23.000Z",
    "endTime": "2022-09-06T06:30:10.000Z",
    "expectNum": 1,
    "requestId": "WOSQ2zMxNTcZOoH1bu****",
    "scalingActivityId": "asa-xxx",
    "totalCapacity": 1
  }
}
```

### BatchCompute Job Success Event - JSON - All Regions

```json
{
    "datacontenttype": "application/json",
    "aliyunaccountid": "123456789098****",
    "aliyunpublishtime": "2021-01-08T15:25:03.083Asia/Shanghai",
    "data": {
        "Status": {
            "EnqueueTime": "2021-02-07T18:16:29.081610969+08:00",
            "State": "Succeeded",
            "CreateTime": "2021-02-07T18:16:29.081610969+08:00"
        },
        "Project": "test48351",
        "OwnerId": "123456789098****",
        "Definition": {
            "MountPoints": [
                {
                    "MountPath": "/home/input/",
                    "Name": "test"
                }
            ],
            "FailStrategy": {
                "WaitingTimeout": 999
            },
            "Type": "Batch",
            "Volumes": [
                {
                    "OSS": {
                        "Bucket": "bcs-test-zb",
                        "Prefix": "blender-demo/scenes/splash279"
                    },
                    "Name": "test"
                }
            ],
            "Command": [
                "python",
                "startclient.py",
                "invoke",
                "{\"action\": \"Convert\", \"parameters\": \"{\\\"widthPixel\\\": \\\"256\\\", \\\"heightPixel\\\": \\\"256\\\", \\\"inputUri\\\": \\\"oss://bcs-test-zb/daemon_app_fast_test/input/sample2.jpg\\\", \\\"outputUri\\\": \\\"oss://bcs-test-zb/daemon_app_fast_test/output/\\\"}\", \"requestid\": \"e419e112-9795-4f6d-a724-438f54ce****\"}"
            ],
            "Envs": {},
            "PackageUri": "oss://bcs-test-zb/daemon_app_fast_test/package/startclient.py",
            "Runtimes": {
                "JobQueue": "cls-0xQ83wAZvGVKCAaTGSU1V69****"
            },
            "Labels": {},
            "Resources": {
                "memory": "1Gi",
                "cpu": "1"
            }
        },
        "JobId": "job-0xQ8VSTpk7HLXyjcdW97EQe****",
        "Name": "e2e test"
    },
    "subject": "acs:batchcompute:cn-hangzhou:123456789098****:test48351/job-0w1JZb8SZ1DEsKbfP99T69t****",
    "specversion": "1.0",
    "aliyuneventbusname": "default",
    "id": "2BF-0w33xE0ZMwZJCh7aat7lEIM****",
    "source": "acs.batchcompute",
    "time": "2021-01-08T15:25:03Z",
    "type": "batchcompute:JobStateChange:JobSucceeded",
    "aliyunpublishaddr": "172.20.XX.XX"
}
```

### ECS Event Publishing - Python - All Regions

```python
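import os

import requests

# Endpoint path as given in this document; confirm it for your account and region.
EVENTS_URL = "https://eventbridge.aliyuncs.com/api/v1/events"


def send_event_to_eventbridge(event):
    """POST a CloudEvents-formatted dict and report whether it was accepted."""
    headers = {
        # Read the API key from the environment at call time.
        "Authorization": f"Bearer {os.environ['DASHSCOPE_API_KEY']}",
        "Content-Type": "application/json",
    }
    # json= serializes the dict; the timeout avoids hanging on an unreachable host.
    response = requests.post(EVENTS_URL, headers=headers, json=event, timeout=10)
    if response.status_code == 200:
        print("Event sent successfully")
    else:
        print(f"Failed to send event: {response.status_code} {response.text}")
    return response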

# Example usage
example_event = {
    "id": "a3a1e190-a357-40c5-a1c2-3e343a90****",
    "source": "acs.ecs",
    "specversion": "1.0",
    "subject": "acs:ecs:cn-hangzhou:123456789098****:disk/d-wz9ad6x3sistd7fh****",
    "time": "2021-01-18T16:20:51.199Z",
    "type": "ecs:Disk:OverduePaymentRelease",
    "aliyunaccountid": "123456789098****",
    "aliyunpublishtime": "2021-01-18T08:20:51.890Z",
    "aliyuneventbusname": "default",
    "aliyunregionid": "cn-hangzhou",
    "aliyunpublishaddr": "172.25.XX.XX",
    "data": {
        "result": "accomplished",
        "instanceId": "i-wz9e60ytsp3lspww****",
        "diskId": "d-wz9ad6x3sistd7fh****"
    }
}

send_event_to_eventbridge(example_event)
```

### Cloud Security Scanner Event - Python - All Regions

```python
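import json

import boto3

# boto3 is the AWS SDK, so this client speaks the AWS-style EventBridge PutEvents API.
# The region and source are kept from the original example; adjust them to an
# endpoint that your credentials can actually reach.
client = boto3.client('events', region_name='cn-beijing')

# Build the Detail payload as a dict and serialize it instead of hand-writing JSON.
detail = {"eventType": "avds:ActionTrail:ApiCall", "resourceId": "i-1234567890"}

# Send a Cloud Security Scanner event
response = client.put_events(
    Entries=[
        {
            'Source': 'avds.cloudsecurityscanner',
            'DetailType': 'SecurityScanEvent',
            'Detail': json.dumps(detail),
            'Time': '2023-02-08T21:30:00Z',
        }
    ]
)

# put_events can partially fail; FailedEntryCount reports rejected entries.
if response.get('FailedEntryCount', 0):
    print(f"Rejected entries: {response['Entries']}")
else:
    print(response)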
```

## Response Format

```json
{
    "specversion":"1.0",
    "subject":"",
    "source":"acs.arms",
    "data":{
        "dims":{
            "ip":"10.96.XX.XX"
        },
        "info":{
            "timestamp":"1605854706272"
        }
    },
    "datacontenttype":"application/json",
    "type":"arms:Agent:AgentStart",
    "id":"d3598ce6-09d7-4264-9ba3-5b13c387****",
    "time":"2020-11-19T21:04:41+08:00"
}
```

**Key Fields**:
- `type` — The event type that identifies the specific condition or action
- `data` — The main payload containing event-specific information
- `source` — The service that generated the event
- `specversion` — The CloudEvents specification version
- `id` — Unique identifier for the event
- `time` — Timestamp when the event was generated
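
The samples in this document carry `time` values with different offsets (`Z` and `+08:00`), so normalizing them to UTC before comparing or storing them can avoid confusion. This is a minimal sketch that assumes Python 3.7 or later for `datetime.fromisoformat`; `event_time_utc` is a hypothetical helper.

```python
from datetime import datetime, timezone


def event_time_utc(event):
    """Parse the event's `time` field and return it as an aware UTC datetime."""
    raw = event["time"]
    # datetime.fromisoformat() before Python 3.11 rejects a trailing "Z".
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(event_time_utc({"time": "2020-11-19T21:04:41+08:00"}).isoformat())
```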

## Error Handling

| Error Code | Description | Recommended Action |
|---------------|--------------------|-----------------------------|
| InvalidParameter | The 'type' parameter contains an invalid or unsupported value. Ensure it matches one of the allowed event types. | Verify the event type against the supported list for the specific service |
| Throttling | Too many requests have been made. Wait before retrying or check your rate limits. | Implement exponential backoff and respect rate limits |
| 400 | Bad Request – The request body is malformed or contains invalid fields. | Validate the request format and ensure all required fields are present |
| 401 | Unauthorized – The API key or authentication token is missing or invalid. | Check your API key and ensure proper authentication headers |
| 403 | Forbidden – The caller does not have permission to publish or consume events. | Verify RAM permissions and service-linked roles |
| 429 | Too Many Requests – Rate limit exceeded; wait before retrying. | Reduce request frequency or implement rate limiting in your application |
| 500 | Internal Server Error – An unexpected error occurred on the server side. | Retry the request after a delay; contact support if the issue persists |

### Rate Limits & Retry
- Standard rate limit: 100 QPS per account for most services
- Some services have different limits (e.g., 10 QPS for certain database services)
- Recommended retry strategy: Exponential backoff with jitter
- Maximum retry attempts: 3-5 attempts before considering the operation failed
- When receiving 429 errors, respect the Retry-After header if provided
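
The retry guidance above can be sketched as follows. `ThrottledError` is a hypothetical exception that your own request code would raise on a 429 or `Throttling` response, and the delay values are illustrative defaults rather than documented limits.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised by the caller's code on a 429/Throttling response."""

    def __init__(self, retry_after=None):
        super().__init__("throttled")
        self.retry_after = retry_after  # seconds from a Retry-After header, if any


def call_with_backoff(operation, max_attempts=4, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Run `operation`, retrying throttled calls with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Full jitter: a random delay up to the exponential ceiling.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            delay = random.uniform(0, ceiling)
            if err.retry_after is not None:
                delay = max(delay, err.retry_after)  # never retry sooner than asked
            sleep(delay)


attempts = {"count": 0}


def flaky_publish():
    """Stand-in for a real API call: throttled twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError()
    return "published"


print(call_with_backoff(flaky_publish, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the delays instead of actually waiting.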

## Environment Requirements

- SDK package: `dashscope>=1.14.0` (for services requiring the DashScope SDK)
- Environment variable setup: `export DASHSCOPE_API_KEY=your_key`
- Python version: Python 3.6 or higher recommended
- Python libraries used by the examples: `requests` (HTTP publishing example) and `boto3` (SDK-based example)

## FAQ

Q: How do I enable EventBridge monitoring for my Alibaba Cloud services?
A: Most Alibaba Cloud services automatically publish events to EventBridge when you have the service enabled. You need to create EventBridge rules to route these events to your targets (functions, HTTP endpoints, queues, etc.). Some services may require enabling ActionTrail or CloudMonitor integration first.

Q: What authentication is required to receive events from EventBridge?
A: Receiving events from EventBridge typically doesn't require authentication since events are pushed to your configured targets. However, if you're making API calls to EventBridge (like creating rules or checking service-linked roles), you'll need proper authentication with your Alibaba Cloud credentials.

Q: How can I handle high volumes of events without losing any?
A: Use EventBridge's built-in retry mechanism and dead-letter queues (DLQs) for failed deliveries. Configure appropriate retry policies and consider using message queues (like MNS or RocketMQ) as intermediate targets to buffer high event volumes. Also ensure your target endpoints can handle the expected load.

Q: Are there any costs associated with receiving events from EventBridge?
A: EventBridge charges based on the number of events delivered. Most services provide a free tier (typically 1000 events per month), and then charge per event beyond that. The exact pricing varies by service, but generally ranges from 0.0001 to 0.001 CNY per event.

Q: How do I troubleshoot missing events in my EventBridge setup?
A: First verify that the source service is actually generating events by checking its logs or console. Then confirm that your EventBridge rule has the right event pattern. Check the rule's monitoring metrics (for example in CloudMonitor) to see whether events are being matched but failing delivery. Finally, examine your target endpoint logs for processing errors.

## Pricing & Billing

### Billing Model
Events are billed per request/event delivered to EventBridge. The billing model is pay-per-use with a free tier included.

### Price Reference

| Tier/Model | Input Price | Output Price | Other Fees |
|-----------|---------|---------|---------|
| default | 0.0001 CNY per event | 0.0001 CNY per event | |
| standard | 0.001 CNY per event | 0.001 CNY per event | |
| Direct Mail events | Free | Free | |

### Free Tier
- Most services: 1000 events per month free
- Some services like BaaS and Resource Management: 10,000 events per month free
- CloudConfig: 1 million events per month free
- BatchCompute: Free event publishing
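
To get a feel for how the free tier and per-event price combine, a rough monthly estimate can be computed as below. The defaults (1000 free events, 0.0001 per event) are the figures quoted in this document and may not match your actual bill; `estimate_monthly_cost` is a hypothetical helper.

```python
from decimal import Decimal


def estimate_monthly_cost(events, free_events=1000, price_per_event=Decimal("0.0001")):
    """Estimate the monthly charge: only events beyond the free tier are billed."""
    billable = max(0, events - free_events)
    # Decimal avoids binary floating-point rounding on small per-event prices.
    return billable * price_per_event


print(estimate_monthly_cost(51_000))
```

Usage below the free tier yields zero, and a service with a larger allowance can be modeled by passing a different `free_events` value.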

### Usage Limits
- Standard QPS limit: 100 requests per second per account
- Some services have lower limits (e.g., 10 QPS for certain database services)
- Event size limits: Typically 8KB to 10KB maximum per event
- Free tier quotas reset monthly
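
Because oversized events may be rejected, it can help to measure an event's serialized size before publishing. The sketch below assumes the lower bound of the range above (8 KB) as the default limit and measures compact UTF-8 JSON; the exact way the service counts bytes is not specified here, so treat the result as an approximation.

```python
import json

# Assumption: use the lower bound of the stated 8 KB to 10 KB range as the limit.
MAX_EVENT_BYTES = 8 * 1024


def event_size_bytes(event):
    """Size of the event when serialized as compact UTF-8 JSON."""
    encoded = json.dumps(event, ensure_ascii=False, separators=(",", ":"))
    return len(encoded.encode("utf-8"))


def fits_size_limit(event, limit=MAX_EVENT_BYTES):
    """True when the serialized event is within the size limit."""
    return event_size_bytes(event) <= limit


small = {"type": "ecs:Disk:OverduePaymentRelease", "data": {"diskId": "d-example"}}
print(event_size_bytes(small), fits_size_limit(small))
```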

### Billing Notes
- Events are charged only when successfully delivered to your target
- Failed or rejected events are not billed
- Async processing tasks do not incur additional charges
- Downstream processing costs (e.g., function invocations) are billed separately
- Some services include event publishing in their base service cost with no additional EventBridge charges