I want to normalize a staging table into two tables. One of the fields of input data contains multiple values delimited by a semicolon.
DATETIME | FILENAME | MULTIVALUEDFIELD
7-22-18 | somefile.txt | value1;value2;value3
I'm sure you data-wizards know where this is going, but I want this normalized thusly:
ID | FILENAME |
1 | somefile.txt
and
ID | SINGLEVALUES | OTHERTABLEID
1 | value1 | 1
2 | value2 | 1
3 | value3 | 1
Sorry for formatting issues: I didn't see a way to do tables. Any help would be awesome!
It sounds like you actually have a many-to-many relationship here; a filename
can refer to many singlevalues
, and singlevalues
can appear in many filename
s.
The traditional way to represent that would be with a bridge table that consists of foreign key references to the primary keys of filename
and singlevalues
, and, if it's applicable to your design, any additional fields that represent the relationship.
dbo.filename
+-------------+---------------+
| filename_sk | filename |
+-------------+---------------+
| 1 | somefile.txt |
| 2 | otherfile.txt |
+-------------+---------------+
dbo.value
+----------+--------+
| value_sk | value |
+----------+--------+
| 1 | value1 |
| 2 | value2 |
| 3 | value3 |
+----------+--------+
dbo.bridge
+-------------+----------+
| filename_sk | value_sk |
+-------------+----------+
| 1 | 1 |
| 1 | 2 |
| 2 | 2 |
| 2 | 3 |
+-------------+----------+