sql, magento, entity-attribute-value

Why does Magento split EAV data across tables by attribute type?


Magento uses five tables to store EAV data, one per attribute data type. Is this an effective choice for performance? When I write a SQL query, I still need a UNION clause to get the whole data set. If I instead used a single mixed table that stores all EAV values in one data type (varchar, or sql_variant in SQL Server 2008), what performance issues would I run into in the future?

Magento EAV diagram


Solution

  • Is it an effective choice for performance?

    The Magento developers chose an EAV structure because it performs well under high volumes of data. A flat table structure suits a small setup, but it becomes less and less efficient as the data grows.

    When I make a sql query, I still need to use UNION clause to get the whole data set.

    Wherever possible, avoid running direct SQL queries against a Magento database. Magento provides Setup models for installing new data. With them you can either use the existing Magento models, together with the Setup models' own methods, to create or modify Magento core config data or variables, or use the underlying Zend Framework's ORM if you need to create new tables, etc.

    As for the EAV part of the database specifically, the way it is set up is complicated if you approach it from the SQL side, which is why the Magento models exist: they wrap it all up in a PHP ORM. Again, avoid SQL queries if you can.

    If you have to make direct queries, you wouldn't be creating UNION queries but joins onto those tables, and you'd use the eav_attribute table as a pivot table to provide you with both the attribute_id (primary key) and the source table which the value will exist in.
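
    The join approach described above can be sketched as follows. This is only an illustration, using Python's built-in sqlite3 module rather than a real Magento database. The table and column names follow Magento's naming (eav_attribute, catalog_product_entity_varchar, catalog_product_entity_decimal), but the schemas are cut down to a few columns, and the sample rows and the helper names (build_demo_db, value_table_for, fetch_product) are made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a cut-down, Magento-like EAV layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE eav_attribute (
            attribute_id   INTEGER PRIMARY KEY,
            attribute_code TEXT NOT NULL,
            backend_type   TEXT NOT NULL  -- suffix of the table holding the values
        );
        CREATE TABLE catalog_product_entity_varchar (
            entity_id INTEGER, attribute_id INTEGER, store_id INTEGER, value TEXT
        );
        CREATE TABLE catalog_product_entity_decimal (
            entity_id INTEGER, attribute_id INTEGER, store_id INTEGER, value REAL
        );
        -- Invented sample data: one product (entity_id 10) with a name and a price.
        INSERT INTO eav_attribute VALUES (1, 'name', 'varchar'), (2, 'price', 'decimal');
        INSERT INTO catalog_product_entity_varchar VALUES (10, 1, 0, 'Blue Mug');
        INSERT INTO catalog_product_entity_decimal VALUES (10, 2, 0, 4.5);
    """)
    return conn


def value_table_for(conn, attribute_code):
    """Use eav_attribute as the pivot: look up which value table an attribute lives in."""
    row = conn.execute(
        "SELECT backend_type FROM eav_attribute WHERE attribute_code = ?",
        (attribute_code,),
    ).fetchone()
    return None if row is None else "catalog_product_entity_" + row[0]


def fetch_product(conn, entity_id):
    """Read name and price as one row using joins, with no UNION."""
    return conn.execute("""
        SELECT name_v.value AS name, price_v.value AS price
        FROM eav_attribute AS name_a
        JOIN catalog_product_entity_varchar AS name_v
          ON name_v.attribute_id = name_a.attribute_id
        JOIN eav_attribute AS price_a
          ON price_a.attribute_code = 'price'
        JOIN catalog_product_entity_decimal AS price_v
          ON price_v.attribute_id = price_a.attribute_id
         AND price_v.entity_id = name_v.entity_id
        WHERE name_a.attribute_code = 'name'
          AND name_v.entity_id = ?
    """, (entity_id,)).fetchone()


conn = build_demo_db()
print(value_table_for(conn, "price"))
print(fetch_product(conn, 10))
```

    Each attribute adds one join onto its value table, so the result comes back as a single row per entity instead of one row per attribute as a UNION would give.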

    By using direct SQL queries you also lose the fallback system that Magento implements where store or website level values can exist, and the Magento models will select them if you ask for them at a store level. If you want to do this manually with SQL then the queries become more complicated as you need to look for those values and if they aren't found, revert to the default (global scope) value.
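
    To give a rough idea of what that manual fallback involves, here is a minimal sketch, again using Python's sqlite3 with a simplified stand-in for one Magento value table. It assumes that the default (global scope) value is stored under store_id 0 and that a store-level override is simply another row with that store's id. The sample data and the function names are invented for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory, simplified stand-in for one EAV value table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE catalog_product_entity_varchar (
            entity_id INTEGER, attribute_id INTEGER, store_id INTEGER, value TEXT
        );
        -- Assumption: store_id 0 is the default scope; store 2 overrides it.
        INSERT INTO catalog_product_entity_varchar VALUES
            (10, 1, 0, 'Blue Mug'),
            (10, 1, 2, 'Tasse bleue');
    """)
    return conn


def value_for_store(conn, entity_id, attribute_id, store_id):
    """Return the store-level value if one exists, otherwise the default value."""
    row = conn.execute("""
        SELECT COALESCE(s.value, d.value)
        FROM catalog_product_entity_varchar AS d       -- default row
        LEFT JOIN catalog_product_entity_varchar AS s  -- optional store override
          ON s.entity_id = d.entity_id
         AND s.attribute_id = d.attribute_id
         AND s.store_id = ?
        WHERE d.entity_id = ?
          AND d.attribute_id = ?
          AND d.store_id = 0
    """, (store_id, entity_id, attribute_id)).fetchone()
    return row[0] if row else None


conn = build_demo_db()
print(value_for_store(conn, 10, 1, 2))  # store 2 has its own value
print(value_for_store(conn, 10, 1, 3))  # store 3 has none, so the default is used
```

    Every attribute you read this way needs its own pair of joins, which is the extra complexity the Magento models otherwise hide.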

    If I use a single mixed table that stores all EAV values in one data type (varchar, or sql_variant in SQL Server 2008), what performance issues will I run into in the future?

    As mentioned before, it depends on the expected scale of your database. You will notice that there are plenty of flat tables in a standard Magento database, and that EAV structures only apply to the parts that Magento developers have decided may increase drastically in volume (customers, catalog etc).

    If you want to implement a custom module and you think it also has the potential to grow quickly over time, you can implement your own EAV tables for it. The Magento model scaffolds support this, and there are plenty of resources online about how to set them up.

    If your tables are likely to remain (relatively) small, then by all means go for a flat table approach. If it's a custom module and you notice rapid growth, you can always convert it later before it becomes a bottleneck.