Let's say we're having an application which should be able to store all kind of products. Each product has at least an ID
and a Name
but all other attributes can be defined by the user himself.
How would you model your schema in respect to the above requirements?
Note: Requirment 4. is important!
Thanks everyone for contributing and discussing the approach. I have seen some solutions to this problem in the past but none of them made grouping easy for me :(
The grouping is not going to be easy because what aggregate operator are you going to use on "color"? Note that it is not possible to use your requirement 4 on case 2.
In any case, the aggregating is only difficult because of the variation in data types and can be mitigated by approaching it in a more typesafe way - knowing that it never makes sense to add apples and oranges.
This is the classic EAV model and it has a place in databases where carefully designed. In order to make it a bit more typesafe, I've seen cases where the values are stored in type-safe tables instead of in a single free form varchar column.
Instead of Values:
EntityID int
,AttributeID int
,Value varchar(255)
You have multiple tables:
EntityID int
,AttributeID int
,ValueMoney money
EntityID int
,AttributeID int
,ValueInt int
etc.
Then to get your iPod capacity per generation:
SELECT vG.ValueVarChar AS Generation, SUM(vC.ValueDecimal) AS TotalCapacity
FROM Products AS p
INNER JOIN Attributes AS aG
ON aG.AttributeName = 'generation'
INNER JOIN ValueVarChar AS vG
ON vG.EntityID = p.ProductID
AND vG.AttributeID = aG.AttributeID
INNER JOIN Attributes AS aC
ON aC.AttributeName = 'capacity'
INNER JOIN ValueDecimal AS vC
ON vC.EntityID = p.ProductID
AND vC.AttributeID = aC.AttributeID
GROUP BY vG.ValueVarChar