2026-02-23 · 3 min read

ClickHouse Merges: How They Work and Why They Matter

Every INSERT into a ClickHouse MergeTree table creates at least one data part. Background merges consolidate these parts over time. Without merges, part count grows unboundedly until query performance degrades and inserts eventually fail. Understanding the merge lifecycle is fundamental to operating ClickHouse in production.

The Part Lifecycle

INSERT → New part created → Part enters merge queue
    → Background merge combines parts
    → Old parts marked inactive → Deleted after TTL

Each INSERT creates 1 part (per partition). ClickHouse immediately queues smaller parts for merging.

Monitoring Merge Activity

-- Currently active merges
SELECT database, table, elapsed,
    round(progress * 100, 1) AS pct_complete,
    formatReadableSize(total_size_bytes_compressed) AS total_size,
    result_part_name
FROM system.merges
ORDER BY elapsed DESC;
 
-- Merge throughput over time
SELECT toStartOfMinute(event_time) AS minute,
    countIf(event_type = 'MergeParts') AS merges_completed,
    sum(rows) AS rows_merged
FROM system.part_log
WHERE event_type = 'MergeParts'
  AND event_time >= now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute DESC;

Controlling Merge Behavior

-- Check merge settings
SELECT name, value FROM system.settings
WHERE name IN (
    'background_pool_size',
    'merge_tree_max_rows_to_use_cache',
    'min_compress_block_size'
);
 
-- Force a merge (use carefully on large tables)
OPTIMIZE TABLE database.table_name;
 
-- Force complete merge into single part
OPTIMIZE TABLE database.table_name FINAL;

Why Too Many Parts Is Dangerous

ClickHouse will warn at 300 parts per partition and slow down inserts at 1000. Each query must open file handles for every active part in a table — too many parts increases query latency and memory usage significantly.

Keep part count below 100 per partition as a general target.

Clustersight monitors part count and merge queue depth, alerting you before part accumulation becomes a problem.

Read more: ClickHouse Merge Queue Guide | ClickHouse Monitoring Guide

Frequently Asked Questions

What are ClickHouse merges?

Merges are background operations where ClickHouse combines multiple smaller data parts into larger ones. This improves query performance (fewer parts to scan), reduces storage (compression improves with larger parts), and prevents too-many-parts errors.

How does ClickHouse decide which parts to merge?

ClickHouse uses an internal algorithm that considers part size, age, and count to select merge candidates. Older, smaller parts are prioritized. The merge_tree_max_rows_to_use_cache and min_bytes_to_rebalance_partition_over_jbod settings influence this behavior.

Can I force a merge in ClickHouse?

Yes. Run OPTIMIZE TABLE database.table FINAL to force a full merge of all parts into one. This is expensive for large tables but useful for small tables or post-migration cleanup.