Search code examples
postgresqldatabase-designherokustorage

How many records can I store in 5 MB of PostgreSQL on Heroku?


I'm going to store records in a single table with 2 fields:

  • id -> 4 characters

  • password_hash -> 64 characters

How many records like the one above will I be able to store in a 5mb PostgreSQL on Heroku?

P.S.: given a single table with x columns and a length of y - how can I calculate the space it will take in a database?


Solution

  • Disk space occupied

    Calculating the space on disk is not trivial. You have to take into account:

    • The overhead per table. Small, basically the entries in the system catalog.

    • The overhead per row (HeapTupleHeader) and per data page (PageHeaderData). Details about page layout in the manual.

    • Space lost to column alignment, depending on data types.

    • Space for a NULL bitmap. Effectively free for tables of 8 columns or less, irrelevant for your case.

    • Dead rows after UPDATE / DELETE. (Until the space is eventually vacuumed and reused.)

    • Size of index(es). You'll have a primary key, right? Index size is similar to that of a table with just the indexed columns and less overhead per row.

    • The actual space requirement of the data, depending on respective data types. Details for character types (incl. fixed length types) in the manual:

      The storage requirement for a short string (up to 126 bytes) is 1 byte plus the actual string, which includes the space padding in the case of character. Longer strings have 4 bytes of overhead instead of 1

      More details for all types in the system catalog pg_type.

    • The database encoding in particular for character types. UTF-8 uses up to four bytes to store one character (But 7-Bit-ASCII characters always occupy just one byte, even in UTF-8.)

    • Other small things that may affect your case, like TOAST - which should not affect you with 64-character strings.

    Calculate with test case

    A simple method to find an estimate is to create a test table, fill it with dummy data and measure with database object size functions::

    SELECT pg_size_pretty(pg_relation_size('tbl'));
    

    Including indexes:

    SELECT pg_size_pretty(pg_total_relation_size('tbl'));
    

    See:

    A quick test shows the following results:

    CREATE TABLE test(a text, b text);
    INSERT INTO test -- quick fake of matching rows
    SELECT chr((g/1000 +32)) || to_char(g%1000, 'FM000')
         , repeat (chr(g%120 + 32), 64)
    FROM   generate_series(1,50000) g;
    
    SELECT pg_size_pretty(pg_relation_size('test'));       -- 5640 kB
    SELECT pg_size_pretty(pg_total_relation_size('test')); -- 5648 kB
    

    After adding a primary key:

    ALTER TABLE test ADD CONSTRAINT test_pkey PRIMARY KEY(a);
    
    SELECT pg_size_pretty(pg_total_relation_size('test')); -- 6760 kB
    

    So, I'd expect a maximum of around 44k rows without and around 36k rows with primary key.