# OpenSearch Search Troubleshooting Guide

## Problem Index

| Problem | Symptom | Severity | Solution Summary |
|--------|---------|----------|------------------|
| Connection Pool Timeout in Java SDK | `ConnectionPoolTimeoutException` thrown under high concurrency | High | Increase max connections via `HttpClientManager.setMaxConnections()` |
| Pagination Limit Exceeded | Error code `6013`: "The sum of start and hit exceeds 5000" | Medium | Use scroll search for deep pagination; ensure `start + hit ≤ 5000` |
| Invalid Signature or Authentication | Error code `4003` or `SignatureDoesNotMatch` | High | Verify AccessKey, parameter sorting, and URL encoding per auth docs |
| Rate Limit or Quota Exceeded | Error codes `3007`, `6015`, or `Throttling.User` | Medium | Reduce request rate or increase LCU quota in console |
| Schema Field Misconfiguration | Error code `6127`: referenced field not configured as attribute | Medium | Update schema offline to add required attribute fields |

## Problem Details

### Problem 1: Connection Pool Timeout in OpenSearch SDK for Java

**Symptoms**
- Error message: `ConnectionPoolTimeoutException`
- Behavior: Requests fail under high concurrency despite healthy backend
- Context: Occurs when using OpenSearch SDK for Java with >50 concurrent requests

**Root Cause**
The default connection pool capacity in the OpenSearch SDK for Java is 50 connections. When concurrent requests exceed this limit, the SDK cannot allocate a new HTTP connection within the timeout window, resulting in a `ConnectionPoolTimeoutException`.

**Solution**
1. Increase the maximum number of connections in the HTTP client manager:

   ```java
   import com.aliyun.opensearch.util.HttpClientManager;

   // Set max connections to 100 (adjust based on workload)
   HttpClientManager.setMaxConnections(100);
   ```
2. Ensure this configuration is applied before making any API calls.

**Verification**
- After applying the fix, run a load test with >50 concurrent requests.
- Confirm no `ConnectionPoolTimeoutException` is thrown.
- Monitor application logs for successful request completion.
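
The load test described above can be approximated with a small harness. The following is a minimal sketch, not part of the SDK: `ConcurrencyCheck` and the `request` callable are hypothetical names, and `request` stands in for whatever single search call your application makes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrencyCheck {

    /**
     * Submits `concurrency` copies of `request` to a pool of the same size
     * and returns how many of them ended with an exception.
     * `request` is a placeholder for one SDK search call (an assumption).
     */
    static int countFailures(Callable<?> request, int concurrency) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Callable<Object>> tasks = new ArrayList<>();
            for (int i = 0; i < concurrency; i++) {
                tasks.add(() -> request.call());
            }
            int failures = 0;
            // invokeAll blocks until every task has finished or failed.
            for (Future<Object> result : pool.invokeAll(tasks)) {
                try {
                    result.get();
                } catch (ExecutionException e) {
                    failures++; // e.g. a ConnectionPoolTimeoutException from the client
                }
            }
            return failures;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for requests", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `ConcurrencyCheck.countFailures(request, 60)` before and after raising the connection limit gives a rough comparison; a result of `0` after the change is the outcome the checklist above asks for.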

### Problem 2: Pagination Limit Exceeded (Error Code 6013)

**Symptoms**
- Error message: `6013`: "The sum of start and hit exceeds 5000. No results are returned."
- Behavior: Search returns empty result set when requesting deep pages (e.g., `start=4900`, `hit=200`)
- Context: Common during batch processing or UI pagination beyond first 5,000 results

**Root Cause**
OpenSearch enforces a hard limit of 5,000 documents for standard pagination (`start + hit ≤ 5000`). This prevents excessive memory usage and degraded performance from deep pagination.
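
The constraint can be checked on the client before a request is sent. This is a minimal sketch; `PageWindow` and its methods are illustrative names, not SDK API, and the limit value is taken from the error message quoted above.

```java
final class PageWindow {

    /** Limit quoted in error 6013: start + hit must not exceed this value. */
    static final int MAX_WINDOW = 5000;

    /** Returns true when the requested window stays within the limit. */
    static boolean fits(int start, int hit) {
        return start >= 0 && hit >= 0 && (long) start + hit <= MAX_WINDOW;
    }

    /**
     * Returns the largest hit count that can be requested at `start`
     * without exceeding the limit, or 0 when `start` is already at or past it.
     */
    static int clampHit(int start, int hit) {
        if (start < 0 || hit < 0) {
            throw new IllegalArgumentException("start and hit must be non-negative");
        }
        return Math.max(0, Math.min(hit, MAX_WINDOW - start));
    }
}
```

For the failing example in the symptoms (`start=4900`, `hit=200`), `fits` returns `false` and `clampHit` returns `100`, the largest page that still fits.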

**Solution**
1. For result sets ≤5,000 documents: adjust parameters so that `start + hit ≤ 5000`.
2. For larger result sets, use scroll search:
   - Set `search_type=scan` in the initial request
   - Use the `scroll` parameter (e.g., `scroll=2m`)
   - Retrieve subsequent batches using the `_scroll_id`

   ```bash
   # Initial scroll request
   curl -X GET "https://<api-endpoint>/v3/openapi/apps/<app_name>/actions/search?search_type=scan&scroll=2m" \
     -H "Content-Type: application/json" \
     -d '{"query":"...", "hit":100}'
   ```
3. Refer to the OpenSearch API reference for full scroll search requirements.
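
The scroll flow in step 2 amounts to a loop that keeps passing the most recent scroll ID back until a batch comes back empty. The sketch below shows only that control flow. `ScrollClient` and `ScrollPage` are hypothetical types invented for illustration; they would need to be backed by real SDK or HTTP calls, and the stop condition (an empty batch) is an assumption to confirm against the API reference.

```java
import java.util.ArrayList;
import java.util.List;

final class ScrollExample {

    /** Hypothetical result of one scroll request: the ID for the next call plus this batch. */
    static final class ScrollPage {
        final String scrollId;
        final List<String> docs;

        ScrollPage(String scrollId, List<String> docs) {
            this.scrollId = scrollId;
            this.docs = docs;
        }
    }

    /** Hypothetical client interface; not part of the OpenSearch SDK. */
    interface ScrollClient {
        /** Initial request, sent with search_type=scan and a scroll timeout. */
        ScrollPage start(String query, int hit);

        /** Follow-up request carrying the scroll ID from the previous response. */
        ScrollPage next(String scrollId);
    }

    /** Collects every document by following scroll IDs until a batch is empty. */
    static List<String> fetchAll(ScrollClient client, String query, int hit) {
        List<String> all = new ArrayList<>();
        ScrollPage page = client.start(query, hit);
        all.addAll(page.docs); // may be empty if the first response only carries an ID
        while (true) {
            page = client.next(page.scrollId);
            if (page.docs.isEmpty()) {
                break;
            }
            all.addAll(page.docs);
        }
        return all;
    }
}
```

Each iteration uses the ID from the latest response rather than the first one, so the loop still works if the service issues a new ID per batch.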

**Verification**
- For standard pagination: confirm that a request at the boundary (`start + hit = 5000`) returns results and no longer triggers error `6013`.
- For scroll: verify `_scroll_id` is returned and subsequent requests yield additional results.

### Problem 3: Invalid Signature or Authentication Failure

**Symptoms**
- Error message: `4003`: "Signature verification failed", or `SignatureDoesNotMatch`
- Behavior: API returns 4xx error even with seemingly correct credentials
- Context: Occurs during signed API requests using AccessKey pairs

**Root Cause**
The request signature does not match the expected value due to:
- Incorrect AccessKey ID or secret
- Improper sorting of request parameters (names must be sorted case-sensitively, so common parameters that start with an uppercase letter precede request parameters that start with a lowercase letter)
- Incorrect URL encoding (e.g., spaces not encoded as `%20`)

**Solution**
1. Validate your AccessKey pair and API endpoint:
   - Log in to OpenSearch console → Instance Management → click **Details** → verify **API Endpoint**
2. Recompute the signature for the request shown in the official signature sample and confirm that your output matches the documented value
3. Ensure parameter sorting and encoding follow authorization documentation:
   - Sort all parameters by name in case-sensitive (ASCII) order
   - In that order, uppercase-named parameters (e.g., `AccessKeyId`, `SignatureMethod`) come before lowercase ones
   - Encode spaces as `%20`, not `+`
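
The sorting and encoding rules above can be sketched as follows. This covers only the canonical query string, not the signature computation itself. `CanonicalQuery` is an illustrative name, and the handling of `*` and `~` follows the common Alibaba Cloud signing convention, which is an assumption here and should be checked against the authorization documentation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;
import java.util.TreeMap;

final class CanonicalQuery {

    /** Percent-encodes a value so that a space becomes %20 rather than +. */
    static String percentEncode(String value) {
        try {
            // URLEncoder produces form encoding; the replacements adjust it:
            // "+" (from a space) -> %20, "*" -> %2A, and "%7E" back to "~".
            return URLEncoder.encode(value, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Joins parameters as name=value pairs, sorted by name in case-sensitive order. */
    static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        // TreeMap uses String's natural ordering, which places 'A'-'Z' before 'a'-'z'.
        for (Map.Entry<String, String> e : new TreeMap<>(params).entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(percentEncode(e.getKey()))
              .append('=')
              .append(percentEncode(e.getValue()));
        }
        return sb.toString();
    }
}
```

Comparing the string from `build` with the canonical string in the official sample is a quick way to tell a sorting or encoding mistake apart from a wrong AccessKey secret.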

**Verification**
- Use the same signing logic on a known-working sample request
- Compare generated signature with expected value
- Successful request returns `200 OK` with valid search results

### Problem 4: Rate Limit or Quota Exceeded

**Symptoms**
- Error messages: 
  - `3007`: "API operations for pushing data are called too frequently"
  - `6015`: "LCUs of computing resources exceed the purchased quota"
  - `Throttling.User`: "Too many requests were made in a short time"
- Behavior: Intermittent 429 or 5xx errors during bursts of activity
- Context: High-throughput indexing or querying workloads

**Root Cause**
OpenSearch enforces quotas on:
- Data push API call frequency (error `3007`)
- Logical Computing Unit (LCU) consumption per second (error `6015`)
- General request rate (error `Throttling.User`)

**Solution**
1. For data push rate limits (`3007`): reduce the frequency of push API calls; batch data where possible.
2. For LCU quota exceeded (`6015`):
   - Log in to OpenSearch console → Instance Management
   - Find your application → click **More** → **Change Specifications/Quotas**
   - Increase the LCU quota
3. For general throttling (`Throttling.User`): implement exponential backoff in client code.
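
Step 3 can be sketched as a retry helper with exponential backoff and random jitter. Everything named here is hypothetical: `Backoff`, `retry`, and `ThrottledException` are illustrative, and real client code would need to map the SDK's throttling error (for example `Throttling.User`) onto the exception that triggers a retry.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

final class Backoff {

    /** Hypothetical marker for a throttled response; not an SDK class. */
    static final class ThrottledException extends RuntimeException {
        ThrottledException(String message) {
            super(message);
        }
    }

    /** Upper bound of the delay for a zero-based attempt: base * 2^attempt, capped. */
    static long ceilingMillis(int attempt, long baseMillis, long capMillis) {
        return (long) Math.min((double) capMillis, baseMillis * Math.pow(2, attempt));
    }

    /** Calls `request`, retrying throttled calls up to `maxAttempts` times in total. */
    static <T> T retry(Supplier<T> request, int maxAttempts, long baseMillis, long capMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                return request.get();
            } catch (ThrottledException e) {
                if (attempt + 1 >= maxAttempts) {
                    throw e; // out of attempts: surface the last throttling error
                }
                long ceiling = ceilingMillis(attempt, baseMillis, capMillis);
                try {
                    // Jitter: sleep a random time in [0, ceiling] so clients do not retry in lockstep.
                    Thread.sleep(ThreadLocalRandom.current().nextLong(ceiling + 1));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted during backoff", interrupted);
                }
            }
        }
    }
}
```

Only the throttling exception is retried; other failures, such as a signature error, propagate immediately because repeating them would not help.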

**Verification**
- After reducing request rate: errors cease and responses return `200`
- After quota increase: monitor LCU usage in console; confirm requests succeed during peak load

### Problem 5: Schema Field Misconfiguration (Error Code 6127)

**Symptoms**
- Error message: `6127`: "The fields referenced in clauses other than query clauses must be configured as attribute fields"
- Behavior: Search fails when using fields in sort, summary, or filter clauses not marked as attributes
- Context: Occurs after schema changes or when using advanced query features

**Root Cause**
OpenSearch requires fields used in non-query clauses (e.g., `filter`, `sort`, `summary`, `distinct`) to be explicitly defined as **attribute fields** in the application schema. If a referenced field is not an attribute field, the request fails with error `6127`.
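
A client-side pre-flight check can catch this before the request is sent. The sketch below assumes the application keeps its own list of attribute field names (for example, copied from the schema in the console). `AttributeCheck` is an illustrative name, not SDK API.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class AttributeCheck {

    /**
     * Returns the referenced field names that are not in the attribute set,
     * sorted alphabetically. An empty result means the request should not hit error 6127.
     */
    static Set<String> missingAttributes(Collection<String> referenced, Set<String> attributeFields) {
        Set<String> missing = new TreeSet<>(referenced);
        missing.removeAll(attributeFields);
        return missing;
    }
}
```

Passing the fields used in the non-query clauses of a request yields exactly the names that still need **Attribute** enabled in the schema.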

**Solution**
1. Modify the application schema offline:
   - In OpenSearch console, go to your application → **Schema Management**
   - Add the referenced field(s) with **Attribute** enabled
2. Deploy the schema change (an offline update may take several minutes)
3. Retry the search request

**Verification**
- After schema update, the same search request returns `200 OK`
- Results include expected fields in sort order or summary

## FAQ

**Q: How do I check if my OpenSearch service is healthy?**  
A: Send a simple search request to your application endpoint. A `200 OK` response with valid results indicates health. Also check the OpenSearch console for instance status and LCU usage metrics.

**Q: What permissions are needed to call OpenSearch APIs?**  
A: Your RAM user must have permissions to access OpenSearch resources. Assign policies like `AliyunOpenSearchFullAccess` or custom policies granting `opensearch:*` actions on your instances.

**Q: How do I enable debug logging for OpenSearch SDK for Java?**  
A: Configure your logging framework (e.g., Log4j, or whichever backend sits behind SLF4J) to set the log level for `com.aliyun.opensearch` to `DEBUG`. This should surface request/response details and connection pool status, depending on what your SDK version logs.

**Q: What causes timeout errors (code 1000), and how can I avoid them?**  
A: Error `1000` indicates a request timeout, often due to large result sets, complex queries, or high system load. Optimize queries (reduce `hit`, simplify expressions), increase client-side timeout settings, or scale up LCU quota. Retry with exponential backoff.

**Q: How do I handle JSON parsing errors (code 4007)?**  
A: Error `4007` occurs when the request body contains unescaped double quotes (`"`) or non-printable characters. Escape special characters in JSON strings (e.g., `\"`), validate JSON structure before sending, or add input sanitization filters.
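
As an illustration of the escaping step, the following sketch escapes double quotes, backslashes, and control characters before a value is placed inside a JSON string. `JsonEscape` is an illustrative helper; in practice a JSON library's serializer does the same job and is usually the safer choice.

```java
final class JsonEscape {

    /** Escapes a raw value so it can be embedded between double quotes in JSON. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become a six-character hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

The helper is meant for individual field values, not for a whole request body: applying it to already-serialized JSON would escape the structural quotes as well.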