I'm at somewhat of a disagreement with a fellow colleague on this issue. First and foremost, the environment (guaranteed, will not change):
And now, the scenario: I am working on an application which will give users the ability to collect input from users and store it for later reporting purposes. The methods of collecting input are in various types of controls they can place on the page. Collecting the data is not the issue; storing it is.
Database Within A Database
I follow the notion that it should be stored as a collection of key-value pairs (in this example, a table called ResponseItems
):
+----------+----------+----+-----+
|ResponseId|FormItemId|Name|Value|
+----------+----------+----+-----+
Where the ResponseId
and FormItemId
form the primary key, the ResponseId
relating to a Response
table, as such:
+----------+------+------+-------------+
|ResponseId|FormId|UserId|SubmittedDate|
+----------+------+------+-------------+
And the FormItemId
relating to a FormItem
table:
+----------+------+----+-----------------+
|FormItemId|FormId|Type|InstantiationData|
+----------+------+----+-----------------+
Where FormId
is the same as the FormId
in the Response
table, referencing a Form
table, which is not as important to illustrate.
XML Serialization
I get back in response that serializing the form components and responses as XML makes more sense, such that it requires less bulk in terms of tables, leaving only a Form table:
+------+--------+
|FormId|Elements|
+------+--------+
and all instantiation data is stored within the XML data for each form element in the Elements
. The responses are then similarly recorded:
+----------+------+-------------+----------------+
|ResponseId|UserId|SubmittedDate|ResponseElements|
+----------+------+-------------+----------------+
The Short Question After The Long Preface
I think the first idea is cleaner and easier to report on. What do you think?
As a final note, there are more tables to this, for both sides, as the forms are to be versioned for every time a user edits the form.
The first approach is called Entity-Attribute-Value (EAV). In my opinion this is never a good way to store relational data. You should read an article called Bad CaRMa before you adopt that design.
I also talk about the disadvantages of EAV in my presentation Practical Object-Oriented Models in SQL and in my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
Of these two designs, I would prefer the XML approach. It's basically Martin Fowler's Serialized LOB pattern.
You might also want to read How FriendFeed Uses MySQL to Store Schema-less Data. Their approach is not specific to MySQL; it would work in SQL Server or any other RDBMS.