Everything you need to monitor ClickHouse.

Purpose-built for ClickHouse — not a generic monitoring tool bolted on to support one more database.

Health Score (0–100)

A composite score across 5 weighted components — replication, storage, errors, infrastructure, and queries. 90+ is healthy. Below 50 is critical.

Fix Commands with Every Alert

Every alert ships with a copy-pasteable SQL fix command. No more Googling. No more guessing. Run the SQL and move on.

400+ Hidden Metrics

Surface the operational metrics standard tools miss: broken parts, mutation backlogs, ZooKeeper lag, replication status, and more — across 18+ system tables.

Slack Alerts

Alerts go straight to your on-call Slack channel with context and the fix command. Your team knows before your users do.

Query Inspector

Identify slow queries by duration, memory usage, and read bytes. Drill into failed queries by error type and analyze parts size distribution.

Multi-Cluster Support

Monitor multiple ClickHouse clusters from a single ClusterSight instance. Cluster overview page, one-click switching, and per-cluster dashboards.

Password Protection

Secure your ClusterSight instance with a password gate. Safe to expose beyond localhost without putting your monitoring data at risk.

Deploy in Under 8 Minutes

Docker Compose deployment. Connect your ClickHouse cluster with a read-only user. Dashboard ready in under 8 minutes.

Fix Commands with Every Alert

Every alert includes root cause context and a copy-pasteable SQL fix command.

Query Inspector

Slow queries with P50/P95/P99 percentiles, failed query breakdown, and parts distribution.
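To make the percentile figures concrete, here is a minimal sketch of how P50/P95/P99 could be derived from a list of query durations. It uses the nearest-rank method, which is an assumption; the page does not say which percentile definition ClusterSight uses, and the function name is hypothetical.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    # Rank is 1-based; integer p keeps the arithmetic exact for whole percentages.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(durations_ms):
    """Return the three percentiles named in the Query Inspector description."""
    return {f"p{p}": percentile(durations_ms, p) for p in (50, 95, 99)}
```

For durations of 1 ms through 100 ms this yields 50, 95, and 99. A single very slow query moves P99 long before it moves P50, which is why the tail percentiles are usually the ones worth watching.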

11 Pre-Built Dashboard Panels

Every panel reads directly from ClickHouse system tables — no agents, no exporters.

🎯
Cluster Health Score

Composite 0–100 score across replication, storage, errors, infrastructure, and queries, graded from A+ (≥95) to F (<40).
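As an illustration of what a weighted composite looks like, the sketch below combines five per-component scores into one number. The weights are hypothetical placeholders, since the page does not publish the real ones, and the "degraded" label for the middle band is likewise an assumption. Only the 90+ (healthy) and below-50 (critical) thresholds come from the page.

```python
# Hypothetical weights for illustration only; the real weighting is not stated on this page.
ASSUMED_WEIGHTS = {
    "replication": 0.25,
    "storage": 0.20,
    "errors": 0.20,
    "infrastructure": 0.20,
    "queries": 0.15,
}


def health_score(components, weights=ASSUMED_WEIGHTS):
    """Weighted average of per-component scores, each expected in the range 0-100."""
    total_weight = sum(weights.values())
    weighted_sum = sum(components[name] * weight for name, weight in weights.items())
    return round(weighted_sum / total_weight, 1)


def status(score):
    """Map a score to a status using the thresholds stated on the page."""
    if score >= 90:
        return "healthy"
    if score < 50:
        return "critical"
    return "degraded"  # assumed label for the 50-89 band
```

With these assumed weights, a cluster scoring 100 everywhere except 20 on replication lands at 80.0, so one badly failing component is enough to pull the cluster out of the healthy band.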

🔄
Replication Status

Per-table replication delay, queue size, and read-only replica detection from system.replicas.
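A rough sketch of the kind of check this panel describes is shown below. The columns `is_readonly`, `absolute_delay`, and `queue_size` exist in `system.replicas`; the thresholds and the function name are assumptions chosen for the example, not ClusterSight's actual values.

```python
# Query a client could run with a read-only user; shown here as a plain string.
REPLICAS_QUERY = """
SELECT database, table, is_readonly, absolute_delay, queue_size
FROM system.replicas
"""


def replica_findings(rows, max_delay_s=300, max_queue=100):
    """Return (table, problem) pairs for rows shaped like the REPLICAS_QUERY result.

    max_delay_s and max_queue are illustrative thresholds, not product defaults.
    """
    findings = []
    for database, table, is_readonly, absolute_delay, queue_size in rows:
        name = f"{database}.{table}"
        if is_readonly:
            findings.append((name, "read-only replica"))
        if absolute_delay > max_delay_s:
            findings.append((name, f"delay {absolute_delay}s"))
        if queue_size > max_queue:
            findings.append((name, f"queue size {queue_size}"))
    return findings
```

A replica that is both read-only and lagging produces two findings, which keeps each symptom visible on its own.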

📊
Merge Queue Depth

Active merges, throughput rate, and backlog estimate from system.merges and system.parts.

💾
Disk Usage

Per-disk free space and usage percentage from system.disks. Alerts when any disk exceeds 85%.
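The 85% rule can be sketched as follows. `system.disks` exposes `free_space` and `total_space` in bytes; the helper name and the row shape are assumptions made for this example.

```python
DISKS_QUERY = "SELECT name, free_space, total_space FROM system.disks"


def disks_over_threshold(rows, threshold_pct=85.0):
    """Return (disk_name, used_pct) for disks whose usage exceeds threshold_pct.

    rows are (name, free_space, total_space) tuples, as DISKS_QUERY would return.
    """
    flagged = []
    for name, free_space, total_space in rows:
        if total_space == 0:
            continue  # skip disks that report no capacity to avoid dividing by zero
        used_pct = 100.0 * (total_space - free_space) / total_space
        if used_pct > threshold_pct:
            flagged.append((name, round(used_pct, 1)))
    return flagged
```

A disk with 10 GB free out of 100 GB is reported at 90.0% and flagged, while one sitting at exactly 85% is not, because the wording is "exceeds".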

⚙️
Mutation Queue

Active and stuck mutations (running for more than 1 hour) from system.mutations. Stuck mutations block merges and degrade performance.
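One way to express the one-hour rule, assuming "stuck" means a mutation that is still unfinished an hour after it was created. The `create_time` and `is_done` columns exist in `system.mutations`; the function name and the dictionary row shape are hypothetical.

```python
from datetime import datetime, timedelta


def stuck_mutations(rows, now, max_age=timedelta(hours=1)):
    """Return unfinished mutations older than max_age.

    rows are dicts with at least 'mutation_id', 'create_time' (datetime), and 'is_done'.
    """
    return [
        row["mutation_id"]
        for row in rows
        if not row["is_done"] and now - row["create_time"] > max_age
    ]
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the check deterministic and easy to test.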

🔴
Broken Parts

Detached and broken part counts from system.detached_parts. Any broken part is a critical finding.

🔗
Keeper Connection Health

Overall ZooKeeper/Keeper connectivity status. Losing quorum halts all replicated operations.

🖧
Keeper Nodes (per-node TCP)

Individual TCP reachability check for each Keeper node. Detects split-brain and ensemble gaps.
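A per-node reachability check of this kind can be sketched with the standard library alone. This is only a minimal illustration: it confirms that a TCP connection can be opened to each node and says nothing about quorum or ensemble state. The function name is an assumption.

```python
import socket


def check_keeper_nodes(nodes, timeout=2.0):
    """Return {"host:port": reachable} for each (host, port) pair.

    A node counts as reachable if a TCP connection opens within the timeout.
    """
    results = {}
    for host, port in nodes:
        key = f"{host}:{port}"
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[key] = True
        except OSError:  # covers refused connections, timeouts, and DNS failures
            results[key] = False
    return results
```

Checking each node separately matters because an overall "connected" status can hide an ensemble running with one member down and no remaining fault tolerance.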

🌿
ZooKeeper Session Health

Session count, ephemeral nodes, and watch count. High watch counts indicate potential memory pressure.

📦
Compression Ratios

Per-table compression ratios from system.parts_columns. Flags tables with ratio below 2.0× as under-compressed.
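The ratio itself is uncompressed bytes divided by compressed bytes. The sketch below assumes per-table sums of `column_data_uncompressed_bytes` and `column_data_compressed_bytes` from `system.parts_columns`; the helper name and row shape are made up for the example.

```python
COMPRESSION_QUERY = """
SELECT database, table,
       sum(column_data_uncompressed_bytes) AS uncompressed,
       sum(column_data_compressed_bytes) AS compressed
FROM system.parts_columns
WHERE active
GROUP BY database, table
"""


def under_compressed(rows, min_ratio=2.0):
    """Return (table, ratio) for tables whose compression ratio is below min_ratio.

    rows are (database, table, uncompressed_bytes, compressed_bytes) tuples.
    """
    flagged = []
    for database, table, uncompressed, compressed in rows:
        if compressed == 0:
            continue  # empty table: no meaningful ratio
        ratio = uncompressed / compressed
        if ratio < min_ratio:
            flagged.append((f"{database}.{table}", round(ratio, 2)))
    return flagged
```

A table storing 1,000 bytes in 800 compressed bytes has a ratio of 1.25× and is flagged; one storing the same data in 250 bytes has a ratio of 4.0× and is not.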

📈
Error Rate Trending

Time-series error rates from system.errors with severity classification and spike detection at 5× the rolling average.
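Spike detection against a rolling average might look like the sketch below. It assumes a spike is a sample greater than 5× the mean of the preceding window; the window length, the function name, and that exact interpretation are assumptions, since the page gives only the 5× factor.

```python
from collections import deque


def detect_spikes(rates, window=12, factor=5.0):
    """Return indexes of samples exceeding factor x the mean of the previous `window` samples."""
    history = deque(maxlen=window)
    spikes = []
    for index, rate in enumerate(rates):
        # Only judge a sample once a full window of history is available.
        if len(history) == window:
            baseline = sum(history) / window
            if baseline > 0 and rate > factor * baseline:
                spikes.append(index)
        history.append(rate)
    return spikes
```

Comparing against a rolling baseline rather than a fixed threshold lets the same rule work for a cluster that normally sees two errors a minute and one that normally sees two hundred.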

Ready to get started?

Free for one ClickHouse cluster. Deploy in under 8 minutes.