Search code examples
databaseformsdatabase-designmodelentity-attribute-value

EAV database design for model with multiple attributes (other models)


I have a model that possesses a lot of attributes with multiple values that are either representations of lists or other models. My research led me to consider an Entity-Attribute-Value design to represent such but I have seen more discouragement from more knowledgeable people than recommendations.

One that sticks to me is this comment:

In a nutshell, EAV is useful when your list of attributes is frequently growing, or when it's so large that most rows would be filled with mostly NULLs if you made every attribute a column. It becomes an anti-pattern when used outside of that context.

by Karl Bielefeldt.

Basically my model is student_report. It has the following attributes based on the actual form:

  • id
  • creator
  • revision history
  • department
  • references
  • funding (optional, variable/not fixed)
  • comments
  • objectives (paragraph)
  • scope (paragraph)

creator, revision history, department, references, funding and comments are other models that this form will rely on.

My initial plan is to create student_report with only the following:

  • id
  • id of creator
  • objectives
  • other paragraph-style content

while the others: revision history, department, references, funding and comments will posses the foreign key student_report_id.

For the variable/not fixed models such as references and funding, I plan to use a mediator table to connect student_form to the "list" of those to normalize the DB:

  • student_report

    | id | name            |
    |----|-----------------|
    | 1  | Abraham Smith   |
    | 2  | Betty Gladstone |
    | 3  | Chen Hong       |
    
  • references

    | id | name         |
    |----|--------------|
    | 1  | Reference 1  |
    | 2  | Reference 2  |
    | 10 | Reference 10 |
    
  • report_references

    | user_id | reference_id |
    |---------|--------------|
    | 1       | 2            |
    | 1       | 3            |
    | 2       | 10           |
    

Is my proposed solution enough? This will be a small-scale project and I doubt this will require hundreds of use a day.


Solution

  • EAV helps you capture data when the data model is not well understood. It allows you to skip over data analysis and to come up with a single design that is so adaptable that it will handle a body of data no matter what the actual structure of the data is.

    But there's a downside. Since you haven't analyzed the data at storage time, you have to analyze the data when you go to retrieve it and turn it into something useful, such as a report or an extract. Otherwise your results are meaningless. This downside can, in some circumstances be much larger than the upside you experienced earlier.

    In your case, you seem to have a good understanding of the attributes you want to store, and the semantics of those attributes. It also looks unlikely that the attribute list is going to have to expand based on surprises.

    So I advise you to avoid EAV, and instead concentrate on how to compose relations out of attributes. Relations are simply collections of attributes, grouped together in some way that is meangful and useful. You can read books about this topic, if you care to.

    In SQL, tables represent relations. Tables have rows, that represent tuples. Tables have columns, that represent attributes. The intersection of a row and a table provides a location where a value can be stored. In !NF, each location stores a "simple" value.

    Your design looks pretty good to me. I think it will serve you better than an EAV model would. I don't know whether it's completely normalized, and I'm not sure it has to be.