Search code examples
joinclickhousematerialized-views

Joining large tables in ClickHouse: out of memory or slow


I have 3 large tables (>100 GB with millions of rows each): events, page_views, and sessions. These tables are connected via 1-n relationships, see table setup below. I'm trying to create a denormalized events_wide table that contains a row for each event, where the corresponding page_views and sessions columns are joined. The idea is to eliminate the joins needed for complex analytics queries, since these joins are slow.

I created a materialized view events_mv which joins the page_views and sessions table to the events table. Whenever a new event is inserted into events, the materialized view should insert a row into events_wide, joining the page_view and session automatically. However, when I insert a single new event, the query either doesn't finish or terminates with an out of memory error.

Even running this simple join query from events to page_views results in an out of memory error: Memory limit (for user) exceeded: would use 99.21 GiB. I use a ClickHouse Cloud production instance with 24+ GB RAM:

SELECT
    -- Select columns from events and page_views
FROM events AS e
LEFT JOIN page_views AS p ON p.property_id = e.property_id AND p.id = e.page_view_id
LIMIT 3;

I tried different primary key orderings for the 3 tables (property_id, created_at, id) vs (property_id, id, created_at), different join algorithms (partial_merge, auto, grace_hash), ANY LEFT JOIN, without success. Maybe using UUIDs instead of numeric IDs is part of the problem, but I can't change the UUIDs unfortunately.

This is my table setup with the (property_id, id, created_at) primary keys:

CREATE TABLE events
(
    id UUID,
    created_at DateTime('UTC'),
    property_id Int,
    page_view_id Nullable(UUID),
    session_id Nullable(UUID),
    ...
) ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, id, created_at)
ORDER BY (property_id, id, created_at);

CREATE TABLE page_views
(
    id UUID,
    created_at DateTime('UTC'),
    modified_at DateTime('UTC'),
    session_id Nullable(UUID),
    ...
) ENGINE = ReplacingMergeTree(modified_at)
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, id, created_at)
ORDER BY (property_id, id, created_at);

CREATE TABLE sessions
(
    id UUID,
    created_at DateTime('UTC'),
    modified_at DateTime('UTC'),
    property_id Int,
    ...
) ENGINE = ReplacingMergeTree(modified_at)
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, id, created_at)
ORDER BY (property_id, id, created_at);


CREATE TABLE events_wide
(
    id UUID,
    created_at DateTime('UTC'),
    property_id Int,
    page_view_id Nullable(UUID),
    session_id Nullable(UUID),
    ...
    -- page_views columns
    p_created_at DateTime('UTC'),
    p_modified_at DateTime('UTC'),
    ...
    -- sessions columns
    s_created_at DateTime('UTC'),
    s_modified_at DateTime('UTC'),
    ...
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(created_at)
PRIMARY KEY (property_id, created_at)
ORDER BY (property_id, created_at, id);


CREATE MATERIALIZED VIEW events_mv TO events_wide AS
SELECT
    e.id AS id,
    e.created_at AS created_at,
    e.session_id AS session_id,
    e.property_id AS property_id,
    e.page_view_id AS page_view_id,
    ...
    -- page_views columns
    p.created_at AS p_created_at,
    p.modified_at AS p_modified_at,
    ...
    -- sessions columns
    s.created_at AS s_created_at,
    s.modified_at AS s_modified_at ,
    ...
FROM events AS e
LEFT JOIN page_views AS p ON p.property_id = e.property_id AND p.id = e.page_view_id
LEFT JOIN sessions AS s ON s.property_id = e.property_id AND s.id = e.session_id
SETTINGS join_algorithm = 'partial_merge';

Solution

  • ClickHouse doesn't have a proper optimizer, so the right tables of the join require to be filtered before performing a join. Otherwise, full tables will be pushed to memory to perform the join causing the issues you're experiencing.

    Using the example you've provided:

    WITH events_block AS (
        SELECT * FROM events LIMIT 3
    )
    SELECT e.*, p.* FROM events_block AS e
    LEFT JOIN (
        SELECT * FROM page_views
        WHERE (property_id, id) IN (
            SELECT property_id, page_view_id FROM events_block
        )
    ) AS p ON p.property_id = e.property_id AND p.id = e.page_view_id;
    

    This could seem weird if you think about single join operation but materialized views are processed in blocks, this will prevent moving to memory full right tables every single time.

    So rewriting the materialized view as follows will do the trick:

    CREATE MATERIALIZED VIEW events_mv TO events_wide AS
    SELECT
        e.id AS id,
        e.created_at AS created_at,
        e.session_id AS session_id,
        e.property_id AS property_id,
        e.page_view_id AS page_view_id,
        ...
        -- page_views columns
        p.created_at AS p_created_at,
        p.modified_at AS p_modified_at,
        ...
        -- sessions columns
        s.created_at AS s_created_at,
        s.modified_at AS s_modified_at,
        ...
    FROM events AS e
    LEFT JOIN (
        SELECT * FROM page_views
        WHERE (property_id, id) IN (
            SELECT property_id, page_view_id FROM events
        )
    ) AS p ON p.property_id = e.property_id AND p.id = e.page_view_id
    LEFT JOIN (
        SELECT * FROM sessions
        WHERE (property_id, id) IN (
            SELECT property_id, session_id FROM events
        )
    ) AS s ON s.property_id = e.property_id AND s.id = e.session_id