API Reference
ClusterSight exposes a REST API at /api/v1/. All responses use the ApiResponse envelope unless noted.
Authentication
When CLUSTERSIGHT_PASSWORD is set, all endpoints (except /api/v1/health) require an Authorization header.
Token derivation
The bearer token is the lowercase hex SHA-256 of the plaintext password:
# Derive the token from your password
echo -n "your_password" | sha256sum | cut -d' ' -f1
# Example output: 5e884898da28047151d0e56f8dc629277...
Making authenticated requests
curl -H "Authorization: Bearer <token>" http://localhost:3001/api/v1/clusters
Dev mode (no authentication)
When CLUSTERSIGHT_PASSWORD is not set, no Authorization header is required. A warning is logged on startup (auth_mode: "dev_mode_no_auth").
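The token derivation and header rules above can be sketched in Python. This is a minimal illustration rather than ClusterSight's own client code: the helper names `derive_token` and `auth_headers` are hypothetical, and UTF-8 encoding of the password is an assumption.

```python
import hashlib
from typing import Dict, Optional


def derive_token(password: str) -> str:
    """Return the lowercase hex SHA-256 digest of the plaintext password.

    Assumes the password is encoded as UTF-8 before hashing.
    """
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def auth_headers(password: Optional[str]) -> Dict[str, str]:
    """Build request headers; empty when no password is configured (dev mode)."""
    if not password:
        return {}
    return {"Authorization": f"Bearer {derive_token(password)}"}
```

Passing `None` mirrors dev mode, where no Authorization header is required.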
Base URL & Versioning
All endpoints are prefixed with /api/v1/.
Base URL (default Docker mapping): http://localhost:3001/api/v1
Response envelope
All endpoints return the ApiResponse envelope:
{
  "data": <payload>,
  "meta": { "cache_hit": false },
  "error": null
}
| Field | Type | Description |
|---|---|---|
| data | any | Response payload (object, array, or null) |
| meta | object | Metadata; may include cache_hit: true for cached panel responses |
| error | object or null | Error details on failure; null on success |
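A client will typically unwrap the envelope before using the payload. The sketch below shows one way to do that in Python, assuming the response body has already been parsed into a dict; `ApiError`, `unwrap`, and `was_cached` are hypothetical names, not part of ClusterSight.

```python
from typing import Any, Dict


class ApiError(Exception):
    """Raised when the envelope's error field is not null (hypothetical name)."""


def unwrap(envelope: Dict[str, Any]) -> Any:
    """Return the data payload, or raise if the envelope reports an error."""
    error = envelope.get("error")
    if error is not None:
        raise ApiError(error)
    return envelope.get("data")


def was_cached(envelope: Dict[str, Any]) -> bool:
    """True only when meta.cache_hit is present and true."""
    return bool((envelope.get("meta") or {}).get("cache_hit", False))
```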
Endpoints Overview
| Method | Path | Description | Auth |
|---|---|---|---|
| GET | /health | Health check + version | No |
| GET | /overview | All clusters with health + alert summary | Yes |
| GET | /clusters | List configured clusters | Yes |
| POST | /clusters | Add a new cluster | Yes |
| POST | /clusters/test | Test connection & discover topology | Yes |
| GET | /clusters/{id} | Get a single cluster | Yes |
| PUT | /clusters/{id} | Update cluster configuration | Yes |
| DELETE | /clusters/{id} | Remove a cluster | Yes |
| GET | /panels/health-score | Cluster health score + components | Yes |
| GET | /panels/replication | Replication lag time-series | Yes |
| GET | /panels/merges | Active merges time-series | Yes |
| GET | /panels/disk | Disk usage per disk | Yes |
| GET | /panels/mutations | Active mutations | Yes |
| GET | /panels/broken-parts | Broken/detached parts count | Yes |
| GET | /panels/keeper | Keeper connection health | Yes |
| GET | /panels/keeper-nodes | Per-node Keeper TCP status | Yes |
| GET | /panels/zookeeper | ZooKeeper session health | Yes |
| GET | /panels/compression | Per-table compression ratios | Yes |
| GET | /panels/errors | Error event time-series | Yes |
| GET | /alerts/rules/metrics | Valid metric keys for custom rules | Yes |
| GET | /alerts/rules | List alert rules | Yes |
| POST | /alerts/rules | Create a custom alert rule | Yes |
| PUT | /alerts/rules/{id} | Update an alert rule | Yes |
| DELETE | /alerts/rules/{id} | Delete a custom alert rule | Yes |
| GET | /alerts/history | Alert history with filters | Yes |
| POST | /alerts/{id}/acknowledge | Acknowledge an alert | Yes |
| POST | /alerts/{id}/snooze | Snooze an alert | Yes |
| POST | /alerts/{id}/escalate | Escalate alert to Slack | Yes |
| GET | /alerts/summary | Unresolved alert count | Yes |
| GET | /alerts/{id} | Single alert detail | Yes |
| GET | /queries/slow | Slow queries analysis | Yes |
| POST | /queries/explain | Fetch EXPLAIN for a query | Yes |
| GET | /queries/failed | Failed queries breakdown | Yes |
| GET | /queries/parts-distribution | Parts size distribution | Yes |
| GET | /settings | Application settings | Yes |
| PUT | /settings | Update application settings | Yes |
| POST | /settings/test-notification | Send a test Slack notification | Yes |
Health
GET /api/v1/health
Health check. Does not require authentication. Safe to use as a liveness probe.
Response:
{
  "data": { "version": "0.1.0" },
  "meta": {},
  "error": null
}
curl:
curl http://localhost:3001/api/v1/health
Cluster Overview
GET /api/v1/overview
Returns a summary of all active clusters: health score, alert counts, collector status. Powers the Cluster Overview page.
Response: Array of ClusterOverviewItem inside data.
{
  "data": [
    {
      "id": "abc123",
      "name": "production",
      "host": "clickhouse.internal",
      "port": 8123,
      "health_score": 87.4,
      "grade": "B+",
      "active_alert_count": 1,
      "collector_status": "online",
      "last_collected_at": "2026-03-22T14:00:00Z",
      "logical_cluster_count": 3
    }
  ],
  "meta": {},
  "error": null
}
Clusters
GET /api/v1/clusters
List all configured clusters.
curl:
curl -H "Authorization: Bearer <token>" http://localhost:3001/api/v1/clusters
Response: Array of ClusterResponse objects.
POST /api/v1/clusters
Add a new cluster.
Request body:
{
  "host": "clickhouse.internal",
  "port": 8123,
  "username": "clustersight_ro",
  "password": "readonly_password",
  "name": "production"
}
Response: ClusterResponse with assigned id.
POST /api/v1/clusters/test
Test a connection before saving. Returns topology information (discovered clusters, Keeper nodes).
Request body: Same as POST /clusters.
Response:
{
  "data": {
    "success": true,
    "message": "Connected successfully",
    "clusters": ["cluster1", "cluster2"],
    "keeper_nodes": ["keeper1:9181", "keeper2:9181"]
  },
  "meta": {},
  "error": null
}
GET /api/v1/clusters/{id}
Get a single cluster by ID.
PUT /api/v1/clusters/{id}
Update cluster configuration (host, port, credentials, name).
DELETE /api/v1/clusters/{id}
Soft-delete a cluster. The cluster is marked inactive; its data is retained.
Response:
{ "data": { "id": "abc123", "deleted": true }, "meta": {}, "error": null }
Dashboard Panels
Each panel is a dedicated endpoint. Panel data is cached for CACHE_TTL seconds (default: 30).
GET /api/v1/panels/health-score
Returns the current health score, grade, trend, and per-component breakdown.
Response:
{
  "data": {
    "overall_score": 87.4,
    "grade": "B+",
    "trend": "stable",
    "previous_score": 85.0,
    "components": {
      "replication": { "score": 100.0, "label": "Replication" },
      "storage": { "score": 82.0, "label": "Storage" },
      "errors": { "score": 95.0, "label": "Errors" },
      "infrastructure": { "score": 80.0, "label": "Infrastructure" },
      "queries": { "score": 70.0, "label": "Queries" }
    },
    "calculated_at": "2026-03-22T14:00:00Z"
  },
  "meta": { "cache_hit": false },
  "error": null
}
No ?range parameter; returns the most recently computed score.
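As an illustration of consuming the health-score payload, the Python sketch below picks out the lowest-scoring component and the change since the previous score. It assumes the `data` object has the shape shown above; the function names are hypothetical, and how ClusterSight itself derives `trend` is not specified here.

```python
from typing import Any, Dict, Tuple


def weakest_component(data: Dict[str, Any]) -> Tuple[str, float]:
    """Return (key, score) of the lowest-scoring entry in data["components"]."""
    components = data["components"]
    key = min(components, key=lambda k: components[k]["score"])
    return key, components[key]["score"]


def score_delta(data: Dict[str, Any]) -> float:
    """Current overall score minus the previous one, rounded to 2 decimals."""
    return round(data["overall_score"] - data["previous_score"], 2)
```

With the sample response above, the weakest component is `queries`.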
GET /api/v1/panels/replication
Query params: ?range=1h|6h|24h|7d (default: 1h)
Returns replication lag time-series per replicated table.
GET /api/v1/panels/merges
Query params: ?range=1h|6h|24h|7d (default: 1h)
Returns active merge counts and estimated completion times.
GET /api/v1/panels/disk
No range param. Returns current disk usage per disk path (used bytes, free bytes, usage %).
GET /api/v1/panels/mutations
No range param. Returns active mutations with estimated parts remaining.
GET /api/v1/panels/broken-parts
No range param. Returns count of detached/broken parts per table.
GET /api/v1/panels/keeper
No range param. Returns Keeper connection status (ONLINE / OFFLINE).
GET /api/v1/panels/keeper-nodes
No range param. Returns per-node TCP health status (status: 1 = reachable, 0 = unreachable).
GET /api/v1/panels/zookeeper
No range param. Returns ZooKeeper/Keeper session health metrics.
GET /api/v1/panels/compression
No range param. Returns per-table compression ratios (compressed / uncompressed bytes).
GET /api/v1/panels/errors
Query params: ?range=<seconds> (integer, 60–604800, default: 3600)
Note: Unlike replication and merges, the errors panel uses an integer seconds range, not the 1h|6h|24h|7d enum.
Returns error event count time-series from system.errors.
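Because the errors panel takes seconds while the other time-series panels take an enum label, a client may want one helper that accepts either form. The following is a sketch under that assumption; `RANGE_SECONDS` and `errors_range` are hypothetical names, and the bounds come from the 60–604800 limit stated above.

```python
from typing import Union

# Seconds equivalent of each enum label used by the replication and merges panels.
RANGE_SECONDS = {"1h": 3600, "6h": 21600, "24h": 86400, "7d": 604800}


def errors_range(value: Union[str, int]) -> int:
    """Convert an enum label or integer into a valid ?range value for /panels/errors."""
    if isinstance(value, str):
        if value not in RANGE_SECONDS:
            raise ValueError(f"unknown range label: {value!r}")
        seconds = RANGE_SECONDS[value]
    else:
        seconds = int(value)
    if not 60 <= seconds <= 604800:
        raise ValueError(f"range must be 60-604800 seconds, got {seconds}")
    return seconds
```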
curl:
# Last 6 hours (21600 seconds)
curl -H "Authorization: Bearer <token>" \
  "http://localhost:3001/api/v1/panels/errors?range=21600"
Alerts
GET /api/v1/alerts/rules
List all alert rules (built-in and custom).
curl:
curl -H "Authorization: Bearer <token>" http://localhost:3001/api/v1/alerts/rules
POST /api/v1/alerts/rules
Create a custom alert rule.
Request body:
{
  "name": "My Custom Rule",
  "metric_key": "disk.usage_pct",
  "operator": ">",
  "threshold": 70.0,
  "severity": "warning",
  "cooldown_sec": 3600
}
Get valid metric keys: GET /api/v1/alerts/rules/metrics
PUT /api/v1/alerts/rules/{id}
Update an existing rule. Built-in rules can have their threshold, severity, and cooldown updated but cannot be deleted.
DELETE /api/v1/alerts/rules/{id}
Delete a custom rule. Returns {"id": "...", "deleted": true}. Built-in rules (rule_type: "built_in") cannot be deleted.
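The rule lifecycle described above can be sketched client-side in Python: assemble the create body, optionally check the metric key against the list from /alerts/rules/metrics, and refuse to delete built-in rules. The function names are hypothetical, and the shape of a rule object beyond `rule_type` is an assumption.

```python
from typing import Any, Dict, Iterable, Optional


def custom_rule(name: str, metric_key: str, operator: str, threshold: float,
                severity: str, cooldown_sec: int,
                valid_metric_keys: Optional[Iterable[str]] = None) -> Dict[str, Any]:
    """Assemble the JSON body for POST /api/v1/alerts/rules.

    If valid_metric_keys is given (for example, fetched from
    GET /api/v1/alerts/rules/metrics), reject keys that are not in it.
    """
    if valid_metric_keys is not None and metric_key not in set(valid_metric_keys):
        raise ValueError(f"unknown metric_key: {metric_key!r}")
    return {
        "name": name,
        "metric_key": metric_key,
        "operator": operator,
        "threshold": threshold,
        "severity": severity,
        "cooldown_sec": cooldown_sec,
    }


def can_delete(rule: Dict[str, Any]) -> bool:
    """Built-in rules (rule_type == "built_in") cannot be deleted."""
    return rule.get("rule_type") != "built_in"
```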
GET /api/v1/alerts/history
List alert history with optional filters.
Query params:
| Param | Type | Default | Description |
|---|---|---|---|
| severity | string | (all) | Filter by warning or critical |
| status | string | (all) | Filter by active, acknowledged, or resolved |
| hours | int (1–168) | 24 | Lookback window in hours |
| limit | int (1–200) | 50 | Maximum results to return |
| offset | int (≥ 0) | 0 | Pagination offset |
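The parameter table above translates directly into client-side validation. The Python sketch below builds the request path and rejects out-of-range values before a request is sent; `history_query` is a hypothetical helper, and sending the defaults explicitly is a design choice of this example, not a requirement of the API.

```python
from typing import Optional
from urllib.parse import urlencode


def history_query(severity: Optional[str] = None, status: Optional[str] = None,
                  hours: int = 24, limit: int = 50, offset: int = 0) -> str:
    """Return the path and query string for GET /api/v1/alerts/history."""
    if severity is not None and severity not in ("warning", "critical"):
        raise ValueError("severity must be 'warning' or 'critical'")
    if status is not None and status not in ("active", "acknowledged", "resolved"):
        raise ValueError("status must be 'active', 'acknowledged', or 'resolved'")
    if not 1 <= hours <= 168:
        raise ValueError("hours must be in 1-168")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be in 1-200")
    if offset < 0:
        raise ValueError("offset must be >= 0")

    params = {}
    if status is not None:
        params["status"] = status
    if severity is not None:
        params["severity"] = severity
    params.update({"hours": hours, "limit": limit, "offset": offset})
    return "/api/v1/alerts/history?" + urlencode(params)
```

To page through results, increase `offset` by `limit` on each call.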
curl:
# Active critical alerts in the last 24 hours
curl -H "Authorization: Bearer <token>" \
  "http://localhost:3001/api/v1/alerts/history?status=active&severity=critical&hours=24"
POST /api/v1/alerts/{id}/acknowledge
Mark an alert as acknowledged. Sets status to acknowledged.
POST /api/v1/alerts/{id}/snooze
Snooze an alert for a specified duration.
Request body:
{ "duration_minutes": 60 }
Response: { "snoozed_until": "<ISO timestamp>", "rule_id": "..." }
POST /api/v1/alerts/{id}/escalate
Escalate an alert to Slack immediately, regardless of notification cooldown.
GET /api/v1/alerts/summary
Returns the count of currently unresolved alerts.
Response:
{ "data": { "count": 3 }, "meta": {}, "error": null }
GET /api/v1/alerts/{id}
Get full detail for a single alert, including metric history and fix command.
Query Inspector
GET /api/v1/queries/slow
Returns the slowest queries from system.query_log.
Query params: ?hours=<1–168> (default: 24), ?limit=<1–200> (default: 50)
curl:
# Top 20 slowest queries in the last 6 hours
curl -H "Authorization: Bearer <token>" \
  "http://localhost:3001/api/v1/queries/slow?hours=6&limit=20"
POST /api/v1/queries/explain
Fetch the EXPLAIN output for a specific query by its hash.
Request body:
{ "query_hash": "abc123def456..." }
GET /api/v1/queries/failed
Returns failed queries grouped by error type.
Query params: ?hours=<1–168> (default: 24), ?limit=<1–200> (default: 50)
GET /api/v1/queries/parts-distribution
Returns parts size distribution across tables.
Query params: ?limit=<1–500> (default: 100)
Settings
GET /api/v1/settings
Get current application settings. The slack_webhook_url field is masked in the response.
PUT /api/v1/settings
Update application settings.
Request body (all fields optional):
{
  "slack_webhook_url": "https://hooks.slack.com/services/...",
  "collection_interval_sec": 30,
  "tier1_enabled": true,
  "tier2_enabled": true,
  "retention_raw_days": 7,
  "retention_hourly_days": 30,
  "retention_daily_days": 365
}
POST /api/v1/settings/test-notification
Send a test Slack notification using the configured webhook URL.
Response: { "data": { "sent": true }, "meta": {}, "error": null }
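Since every field in the PUT /settings body is optional, a client can send only the keys it wants to change. The sketch below builds such a body in Python and rejects misspelled keys; `SETTINGS_FIELDS` and `settings_patch` are hypothetical names, and it is an assumption that the server leaves omitted fields unchanged.

```python
import json

# Field names taken from the request body documented above.
SETTINGS_FIELDS = {
    "slack_webhook_url",
    "collection_interval_sec",
    "tier1_enabled",
    "tier2_enabled",
    "retention_raw_days",
    "retention_hourly_days",
    "retention_daily_days",
}


def settings_patch(**changes) -> str:
    """Serialize only the supplied settings, rejecting unknown field names."""
    unknown = set(changes) - SETTINGS_FIELDS
    if unknown:
        raise ValueError(f"unknown settings fields: {sorted(unknown)}")
    return json.dumps(changes, sort_keys=True)
```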
Error Codes
| HTTP Status | Meaning in ClusterSight |
|---|---|
| 401 Unauthorized | Missing or invalid Authorization header. Provide Bearer <sha256-token> or disable password protection. |
| 404 Not Found | The requested resource (cluster, alert, rule) does not exist or has been deleted. |
| 409 Conflict | Duplicate resource, e.g., a cluster with the same host:port already exists. |
| 422 Unprocessable Entity | Validation error. Response body contains field-level details: { "error": { "detail": [...] } } |
| 503 Service Unavailable | ClickHouse is unreachable. The panel or query endpoint could not connect to the cluster. Verify ClickHouse is running and accessible. |
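One way a client might act on these status codes is sketched below in Python. The mapping text paraphrases the table above. Treating only 503 as retryable is an assumption of this example, not documented ClusterSight behavior, and all names are hypothetical.

```python
from typing import Any, Dict, List

ERROR_HINTS = {
    401: "Missing or invalid Authorization header",
    404: "Resource does not exist or has been deleted",
    409: "Duplicate resource",
    422: "Validation error; see error.detail",
    503: "ClickHouse is unreachable",
}


def describe_status(status: int) -> str:
    """Return a short hint for a documented error status."""
    return ERROR_HINTS.get(status, f"Unexpected HTTP status {status}")


def should_retry(status: int) -> bool:
    """Assumption: only 503 is worth retrying, since the cluster may come back."""
    return status == 503


def validation_details(envelope: Dict[str, Any]) -> List[Any]:
    """Extract the field-level details from a 422 response body."""
    return (envelope.get("error") or {}).get("detail", [])
```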
curl Examples
List all clusters
TOKEN=$(echo -n "your_password" | sha256sum | cut -d' ' -f1)
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:3001/api/v1/clusters
Get health score
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:3001/api/v1/panels/health-score
List active alerts (last 24 hours)
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:3001/api/v1/alerts/history?status=active&hours=24"
Add a cluster
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"host":"clickhouse.internal","port":8123,"username":"clustersight_ro","password":"pw","name":"prod"}' \
  http://localhost:3001/api/v1/clusters
Acknowledge an alert
curl -X POST \
  -H "Authorization: Bearer $TOKEN" \
  http://localhost:3001/api/v1/alerts/<alert-id>/acknowledge
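For scripts that prefer Python over curl, the same requests can be assembled with the standard library. This is a sketch, not an official client: `BASE_URL` assumes the default Docker mapping, and `build_request` is a hypothetical helper that only constructs the request without sending it.

```python
import json
import urllib.request
from typing import Any, Optional

# Assumes the default Docker mapping described under Base URL & Versioning.
BASE_URL = "http://localhost:3001/api/v1"


def build_request(method: str, path: str, token: Optional[str] = None,
                  body: Optional[Any] = None) -> urllib.request.Request:
    """Construct (but do not send) a request to the ClusterSight API."""
    headers = {}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    data = None
    if body is not None:
        # JSON bodies need an explicit content type, as in the curl example.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)
```

Pass the result to `urllib.request.urlopen` to send it, then parse the response body with `json.load`.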