Search code examples
mysqlsqldatabase-designdatabase-schemaentity-attribute-value

Entity-Attribute-Value Table Design


I am currently designing a database structure for the products section of an ecommerce platform. It needs to be designed in such a way that makes it possible to sell an infinite number of different types of products with an infinite number of different attributes.

E.g. The attributes of a laptop would be RAM, Screen Size, Weight, etc. The attributes of a book would be Author, ISBN, Publisher, etc.

It seems like an EAV structure would be most suitable.

  • Select a product
  • Product belongs to attribute set
  • Attribute set contains attributes x and y
    • Attribute x is data type datetime (values stored in attribute_values_datetime)
    • Attribute y is data type int (values stored in attribute_values_int)
  • Each attribute definition denotes the type (i,e, x has column type -> datetype)

Assuming the above, could I join the selection to the attribute_values_datetime table to get the right data without getting the result set and building a second query now that the table is known? Would there be a large performance hit constructing a query of this type or would the below be more suitable (although less functional)

  • Select a product
  • Product belongs to attribute set
  • Attribute set contains attributes x and y
    • Attribute x is data type datetime but stored as TEXT in attribute_values
    • Attribute y is data type int but stored as TEXT in attribute_values

Solution

  • I'm going to offer a contrary opinion to most of the comments on this question. While EAV is EVIL for all of the reasons that you can find thoroughly explained many times here on SO and DBA.SE and elsewhere, there is one really common application for which most of the things that are wrong with EAV are largely irrelevant and the (few) advantages of EAV are very much germane. That application is online product catalogs.

    The main problem with EAV is that it doesn't let the database do what it is really good at doing, which is helping to give proper context to different attributes of information about different entities by arranging them in a schema. Having a schema brings many, many advantages around accessing, interpreting and enforcing integrity of your data.

    The fact about product catalogs is that the attributes of a product are almost entirely irrelevant to the catalog system itself. Product catalog systems do (at most) three things with product attributes.

    1. Display the product attributes in a list to end users in the form: {attribute name}: {attribute value}.

    2. Display the attributes of multiple products in a comparison grid where attributes of different products line up against each other (products are usually columns, attributes are usually rows)

    3. Drive rules for something (e.g. pricing) based on particular attribute/value combinations.

    If all your system does is regurgitate information that is semantically irrelevant (to the system) then the schema for this information is basically unhelpful. In fact the schema gets in the way in an online product catalog, especially if your catalog has many diverse types of products, because you're always having to go back into the schema to tinker with it to allow for new product categories or attribute types.

    Because of how it's used, even the data type of an attribute value in a product catalog is not necessarily (vitally) important. For some attributes you may want to impose contraints, like "must be a number" or "must come from this list {...}". That depends on how important attribute consistency is to your catalog and how elaborate you want your implementation to be. Looking at the product catalogs of several online retailers I'd say most are prepared to trade off simplicity for consistency.

    Yes, EAV is evil, except when it isn't.