I am designing a database for entities like stores, products, customers, suppliers and warehouses. The database will be used as an online transactional system in different physical stores.
In my case all the attributes of stores, customers, suppliers and warehouses can be defined as columns because they won't change much over time. However the products have an unlimited amount of attributes so I would like to put this data in an EAV model.
Can anybody point out if the values should be put in their own datatype-specific table (such as attribute_values_int, or as columns in the generic attribute_value table? Because of performance reasons Magento has chosen datatype-specific value tables. See: http://blog.magestore.com/2012/03/23/magento-certificate-eav-model-concepts
Thanks.
Frankly, the best option is "not EAV". Look into using hstore
fields, XML
, or json
.
In PostgreSQL there is no performance advantage to using per-datatype tables. NULL
values are stored in a compact NULL
bitmap, so it makes very little difference whether you have a tuple like (NULL, NULL, NULL, 42, NULL, NULL)
or just (42)
.
This also allows you to add CHECK
constraint enforcing that exactly one field must be non-NULL
, so you don't get multiple values of different types.
Demo:
regress=> CREATE TABLE eav_ugh (
entity_id integer,
int_value integer,
numeric_value numeric,
text_value text,
timestamp_value timestamp with time zone,
CONSTRAINT only_one_non_null CHECK (
(int_value IS NOT NULL AND numeric_value IS NULL AND text_value IS NULL AND timestamp_value IS NULL) OR
(int_value IS NULL AND numeric_value IS NOT NULL AND text_value IS NULL AND timestamp_value IS NULL) OR
(int_value IS NULL AND numeric_value IS NULL AND text_value IS NOT NULL AND timestamp_value IS NULL) OR
(int_value IS NULL AND numeric_value IS NULL AND text_value IS NULL AND timestamp_value IS NOT NULL)
)
);
CREATE TABLE
regress=> insert into eav_ugh (entity_id, numeric_value) select x, x from generate_series(1,5000) x;
INSERT 0 5000
regress=> select pg_relation_size('eav_ugh');
pg_relation_size
------------------
229376
(1 row)
regress=> CREATE TABLE no_null_cols(entity_id integer, numeric_value numeric);
CREATE TABLE
regress=> insert into no_null_cols (entity_id, numeric_value) select x, x from generate_series(1,5000) x;
INSERT 0 5000
regress=> select pg_relation_size('no_null_cols');
pg_relation_size
------------------
229376
(1 row)
regress=> SELECT sum(pg_column_size(eav_ugh)) FROM eav_ugh;
sum
--------
164997
(1 row)
regress=> SELECT sum(pg_column_size(no_null_cols)) FROM no_null_cols;
sum
--------
164997
(1 row)
In this case the null bitmap isn't adding any space at all, likely due to alignment requirements.