Search code examples
mysqlpostgresqldatabase-designentity-attribute-value

Should I place EAV values in a datatype table?


I am designing a database for entities like stores, products, customers, suppliers and warehouses. The database will be used as an online transactional system in different physical stores.

In my case all the attributes of stores, customers, suppliers and warehouses can be defined as columns because they won't change much over time. However the products have an unlimited amount of attributes so I would like to put this data in an EAV model.

Can anybody point out if the values should be put in their own datatype-specific table (such as attribute_values_int, or as columns in the generic attribute_value table? Because of performance reasons Magento has chosen datatype-specific value tables. See: http://blog.magestore.com/2012/03/23/magento-certificate-eav-model-concepts

Thanks.


Solution

  • Frankly, the best option is "not EAV". Look into using hstore fields, XML, or json.

    In PostgreSQL there is no performance advantage to using per-datatype tables. NULL values are stored in a compact NULL bitmap, so it makes very little difference whether you have a tuple like (NULL, NULL, NULL, 42, NULL, NULL) or just (42).

    This also allows you to add CHECK constraint enforcing that exactly one field must be non-NULL, so you don't get multiple values of different types.

    Demo:

    regress=> CREATE TABLE eav_ugh (
        entity_id integer,
        int_value integer,
        numeric_value numeric,
        text_value text,
        timestamp_value timestamp with time zone,
        CONSTRAINT only_one_non_null CHECK (
                (int_value IS NOT NULL AND numeric_value IS NULL AND text_value IS NULL AND timestamp_value IS NULL) OR
                (int_value IS NULL AND numeric_value IS NOT NULL AND text_value IS NULL AND timestamp_value IS NULL) OR
                (int_value IS NULL AND numeric_value IS NULL AND text_value IS NOT NULL AND timestamp_value IS NULL) OR
                (int_value IS NULL AND numeric_value IS NULL AND text_value IS NULL AND timestamp_value IS NOT NULL)
        )
    );
    CREATE TABLE
    regress=> insert into eav_ugh (entity_id, numeric_value) select x, x from generate_series(1,5000) x;
    INSERT 0 5000
    regress=> select pg_relation_size('eav_ugh');                                           
     pg_relation_size 
    ------------------
               229376
    (1 row)
    
    regress=> CREATE TABLE no_null_cols(entity_id integer, numeric_value numeric);
    CREATE TABLE
    regress=> insert into no_null_cols (entity_id, numeric_value) select x, x from generate_series(1,5000) x;
    INSERT 0 5000
    regress=> select pg_relation_size('no_null_cols');
     pg_relation_size 
    ------------------
               229376
    (1 row)
    
    regress=> SELECT sum(pg_column_size(eav_ugh)) FROM eav_ugh;
      sum   
    --------
     164997
    (1 row)
    
    regress=> SELECT sum(pg_column_size(no_null_cols)) FROM no_null_cols;
      sum   
    --------
     164997
    (1 row)
    

    In this case the null bitmap isn't adding any space at all, likely due to alignment requirements.