Search code examples
sqlsqlite

Overlay table mechanism in SQLite


I wonder if there is a mechanism for tables in SQLite similar to the overlayfs file system.

That means I have some table with pre filled data (the base). It is read only. Then I have a second read write table with same table schema (the overlay). I want to create a (virtual?) table that presents itself as if it contained the data merged from both tables, but with the overlay taking priority. In the sense that if the overlay contains same primary key as the base table the data from overlay is retrieved, and in the other case the base table data is taken. Writes or updates would always go to the overlay.

Is this possible?

I found the virtual union table extension, but that apparently only works for tables with mutually different primary keys.


Solution

  • You achieve this using window functions and UNION ALL operator.

    First, let's start with two tables: base and derived.

    CREATE TABLE base    (a INT PRIMARY KEY, b TEXT);
    CREATE TABLE derived (a INT PRIMARY KEY, b TEXT);
    

    Then, add some data.

    INSERT INTO base    (a, b) VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO derived (a, b) VALUES (1, 'uno'), (2, 'duo'), (4, 'cuatro');
    

    Next, we can create an overlay table as:

    SELECT a, b
    FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY a 
            ORDER BY priority DESC
        ) AS rank
        FROM (
            SELECT a, b, 1 AS priority FROM derived
            UNION ALL
            SELECT a, b, 0 AS priority FROM base
        ) everything
    ) overlaid
    WHERE rank = 1;
    

    To make it easy to use, you can convert it into a VIEW as:

    CREATE VIEW overlay AS (); -- the query from above
    

    Explanation

    1. We combine base and derived into everything and assign priority (higher to derived and lower to base)

    2. We assign rank to sub-groups identified by our primary-key a based on priority. If there's a row in both derived and base, the one in derived will receive rank = 1.

    3. We pick only rows with rank = 1

    This solution works well as if a row exists in both tables, the one from derived replaces base, else if a row is missing from derived, it stays from base. Lastly, a row that exists only in derived gets included.