
Entity attribute value model - Performance alternative?


I work with PHP and MySQL.

I have a page table and a meta table. They look a little bit like this.

Page table

page_id | headline    | content
-------------------------------
1       | My headline | My content
2       | Another one | Another text

Meta table

id | page_id | meta_key  | meta_value
------------------------------------
1  | 2       | seo_title | Hello world
2  | 2       | price     | 299
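
For reference, the schema behind these two tables could be defined roughly like this (the column types, sizes, engine, and foreign key are my assumptions):

    CREATE TABLE page (
        page_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        headline VARCHAR(255) NOT NULL,
        content  TEXT NOT NULL
    ) ENGINE=InnoDB;

    CREATE TABLE meta (
        id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
        page_id    INT UNSIGNED NOT NULL,
        meta_key   VARCHAR(64)  NOT NULL,  -- attribute name
        meta_value TEXT,                   -- attribute value, stored as text
        FOREIGN KEY (page_id) REFERENCES page (page_id)
    ) ENGINE=InnoDB;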

I've read that this type of model is called EAV (entity-attribute-value). I've also read that it is bad for performance.

My meta table is meant to hold any kind of value connected to a page. I cannot create a table with "static" columns this time.

Question

  • How bad is this for 300 pages with 30 meta values on each page? That is 9,000 rows in the meta table.
  • Is there a better model for "dynamic" data?

Solution

  • First, sometimes this model makes querying data much easier. I asked a question here a couple of days ago, and some users asked why I didn't change my model to 1NF form to make querying easier. Only when they realized I was stuck with this design did they provide answers to the question. The point is that I was lucky enough to have only 12 columns to sum up; had my table contained 300 columns, perhaps no user would have bothered to write a query for that problem. :-)
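
    To illustrate that querying trade-off, pivoting EAV rows back into columns in MySQL is typically done with conditional aggregation. A sketch against the tables above (the keys seo_title and price come from the question; everything else is illustrative):

        SELECT p.page_id,
               p.headline,
               MAX(CASE WHEN m.meta_key = 'seo_title' THEN m.meta_value END) AS seo_title,
               MAX(CASE WHEN m.meta_key = 'price'     THEN m.meta_value END) AS price
        FROM page p
        LEFT JOIN meta m ON m.page_id = p.page_id
        GROUP BY p.page_id, p.headline;

    With 300 distinct keys you would need 300 such MAX(CASE ...) expressions, which is exactly the pain point described above.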

    Second, sometimes the implementation of this design is easier because of limits the database itself imposes. If your meta_key values are longer than the database's identifier length limit (historically 30 characters in some databases), turning them into column names is not even possible: you would either have to shorten the keys and keep a mapping somewhere, or EAV might be the only option you have.
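
    For example, a descriptive key that would never fit within such an identifier limit is unremarkable as ordinary row data (the key below is made up):

        -- 40 characters: too long for a 30-character identifier limit,
        -- but perfectly fine as a row value in the meta table
        INSERT INTO meta (page_id, meta_key, meta_value)
        VALUES (2, 'estimated_delivery_time_in_business_days', '5');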

    Finally, performance is very important; that's true. But on the other hand, there are techniques you can apply to improve it, such as creating proper indexes, partitioning tables, and so on.
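
    As a concrete example of such an index, a composite index matching the typical lookup patterns ("all meta for page X" and "key Y of page X") could look like this (the index name is mine):

        -- Serves both per-page scans and single-key lookups;
        -- make it UNIQUE if each key may appear only once per page
        CREATE INDEX idx_meta_page_key ON meta (page_id, meta_key);

        -- A lookup that can be satisfied by this index
        SELECT meta_value
        FROM meta
        WHERE page_id = 2
          AND meta_key = 'price';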

    In this case, the table sizes are very small. So unless your queries are very complicated, with heavy calculations, complex joins, and aggregations, and unless the application is sensitive to tiny fractions of a second, I don't think you would suffer performance problems by adopting this model.

    In the end, if you are still concerned about performance, I would suggest creating both models, populating them with some random or real data, and analyzing the plan costs to see which model better suits your needs.
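
    In MySQL, that comparison could start with EXPLAIN (newer versions also support EXPLAIN FORMAT=JSON, which includes cost estimates); the wide table page_wide below is hypothetical:

        -- EAV lookup
        EXPLAIN
        SELECT meta_value
        FROM meta
        WHERE page_id = 2
          AND meta_key = 'price';

        -- Equivalent lookup on a hypothetical wide table with static columns
        EXPLAIN
        SELECT price
        FROM page_wide
        WHERE page_id = 2;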