Search code examples
sqliteblob

How to unnest an integer array represented as a BLOB?


SQLite doesn’t have native support for arrays. I would think that my method (thinking outlined below) of making a custom BLOB encoding would be a fairly common workaround (yes, I need an array rather than normalizing the table). The benefit of representing an integer array as a BLOB is primarily the space savings, for example:

13,24,455,23,64789

Stored as TEXT will take up 18 bytes (commas included, making assumptions here). But if one were to store the above TEXT in a custom encoded BLOB format it would look like this:

0x000D001801C7FD15

Where every number is assumed to take up 2 Bytes (0 to 65,535 or 0000 to FFFF). This BLOB, to my understanding, would then be 10 Bytes. This is nearly half the size as storing it in a delimited TEXT format. This space savings would also be magnified by the number of rows and the number of integers in the array.

Is there a good way of unnesting a BLOB by width? Say that I want to unnest the BLOB so that each row represents an integer. Can I take the above BLOB and turn it into this?

id number
1 000D
2 0018
3 01C7
4 FD15

Solution

  • SQLite's HEX() function

    interprets its argument as a BLOB and returns a string which is the upper-case hexadecimal rendering of the content of that blob.

    After you get the blob as a string use SUBSTR() to slice it in 4 char parts:

    WITH cte(id) AS (VALUES (1), (2), (3), (4), (5))
    SELECT HEX(t.col) hex_col, 
           c.id,
           SUBSTR(HEX(t.col), (c.id - 1) * 4 + 1, 4) number
    FROM tablename t JOIN cte c
    ORDER BY t.rowid, c.id;
    

    Or, with a recursive CTE to generate dynamically the list of values 1..n:

    WITH cte(id) AS (
      SELECT 1 
      UNION ALL
      SELECT id + 1
      FROM cte
      WHERE id < 5
    )
    SELECT HEX(t.col) hex_col, 
           c.id,
           SUBSTR(HEX(t.col), (c.id - 1) * 4 + 1, 4) number
    FROM tablename t JOIN cte c
    ORDER BY t.rowid, c.id;
    

    Change col to the name of your blob column.

    See the demo.