Search code examples
postgresqlhstore

Should I use hstore in renewed data model?


In my legacy database (Postgres 9.1) I have several tables containing divers kinds of documents (let's say they are 'parent' tables). Additionally, there is a table with various parameters for these documents:

create table params (
    kind integer,
    docid integer,
    parname text,
    parvalue text,
    constraint params_pk primary key (kind, docid, parname));

There may be many (parname, parvalue) pairs for one document. As kind points to different tables it cannot be used as a foreign key.

It has been working well for years as params were used only for printing documents. Now this table contains 5 millions rows and the data is needed also for other purposes. So it is high time to renew this model.

Basically params are inserted once for a document and very seldom updated. They will be read as a whole (for a document). There is no need to search for a specific parname.

I have three ideas:

Variant A. Split table params into several tables according to parent tables and use docid as foreign key.

Variant B. Split table params as in variant A and store (parname, parvalue) as hstore.

Variant C. Add a hstore field in every parent table and forget additional tables.

I have no experience with hstore. What are the cons and pros of each variant? Which one would you choose? Can hstore baffle me with something strange?


Solution

  • I vote for the third option. The fewer tables the better sleep.

    Hstore was invented for one-level parameter lists. It is stable, fast and simple, and perfectly fits your needs. I had similar task some time ago. I wrote an aggregate for easier conversion.

    create or replace function hstore_add(hstore, text, text)
    returns hstore language plpgsql 
    as $$
    begin
        return case
            when $1 isnull then hstore($2, $3)
            else $1 || hstore($2, $3) end;
    end $$;
    
    create aggregate hstore_agg (text, text) (
        sfunc = hstore_add,
        stype = hstore
    );
    

    I think it may save your time.

    select kind, docid, hstore_agg(parname, parvalue)
    from params
    group by 1, 2
    order by 1, 2;