Search code examples
sqlsqlitevirtual-table

Create virtual table with rowid only of another table


Suppose I have a table in sqlite as follows:

`name`     `age`
"bob"      20      (rowid=1)
"tom"      30      (rowid=2)      
"alice"    19      (rowid=3)

And I want to store the result of the following table using minimal storage space:

SELECT * FROM mytable WHERE name < 'm' ORDER BY age

How can I store a virtual table from this resultset that will just give me the ordered resultset. In other words, storing the rowid in an ordered way (in the above it would be 3,1) without saving all the data into a separate table.

For example, if I stored this information with just the rowid in a sorted order:

CREATE TABLE vtable AS 
SELECT rowid from mytable WHERE name < 'm' ORDER BY age;

Then I believe every time I would need to query the vtable I would have to join it back to the original table using the rowid. Is there a way to do this so that the vtable "knows" the content that it has based on the external table (I believe this is referred to as external-content when creating an fts index -- https://sqlite.org/fts5.html#external_content_tables).


Solution

  • I believe this is referred to as external-content when creating an fts.

    No a virtual table is CREATED using CREATE VIRTUAL TABLE ...... USING module_name (module_parameters)

    Virtual tables are tables that can call a module, thus the USING module_name(module_parameters) is mandatory.

    For FTS (Full Text Serach) you would have to read the documentation but it could be something like

    CREATE VIRTUAL TABLE IF NOT EXISTS bible_fts USING FTS3(book, chapter INTEGER, verse INTEGER, content TEXT)
    

    You very likely don't need/want a VIRTUAL table.


    CREATE TABLE vtable AS SELECT rowid from mytable WHERE name < 'm' ORDER BY age;
    

    Would create a normal table IF it didn't already exist that would persist. And if you wanted to use it then it would probably only be of use by joining it with mytable. Effectively it would allow a snapshot, but at a cost, of at least 4k for every snapshot.

    I'd suggest a single table for all snapshots that has two columns a snapshot identifier and the rowid of the snapshot. This would probably be far less space consuming.

    Basic Example

    Consider :-

    CREATE TABLE IF NOT EXISTS mytable (
        id INTEGER PRIMARY KEY, /* NOTE not using an alias of the rowid may present issues as the id's can change */
        name TEXT,
        age INTEGER
    );
    CREATE TABLE IF NOT EXISTS snapshot (id TEXT DEFAULT CURRENT_TIMESTAMP, mytable_map);
    
    INSERT INTO mytable (name,age) VALUES('Mary',21),('George',22);
    
    INSERT INTO snapshot (mytable_map) SELECT id FROM mytable;
    
    SELECT snapshot.id,name,age FROM snapshot JOIN mytable ON mytable.id = snapshot.mytable_map;
    

    And the above is run 3 times with a reasonable interval (seconds so as to distinguish the snapshot id (the timestamp)).

    Then you would get 3 snapshots (each with a number of rows but the same value in the id column for each snapshot), the first with 2 rows, the 2nd with 4 and the last with 6 (as each run 2 rows are being added to mytable) :-

    enter image description here