Search code examples
mysqldatabaseperformancedatabase-normalizationdatabase-table

Is there a shortcut to normalizing a table where the columns=rows?


Suppose you had the mySQL table describing if you can mix two substances

Product   A    B    C
---------------------
A         y    n    y
B         n    y    y
C         y    y    y

The first step would be to transform it like

P1   P2   ?
-----------
A    A    y
A    B    n
A    C    y
B    A    y
B    B    y
B    C    n
C    A    y
C    B    n
C    C    y

But then you have duplicate information. (eg. If A can mix with B, then B can mix with A), so, you can remove several rows to get

P1   P2   ?
-----------
A    A    y
A    B    n
A    C    y
B    B    y
B    C    n
C    C    y

While the last step was pretty easy with a small table, doing it manually would take forever on a larger table. How would one go about automating the removal of rows with duplicate MEANING, but not identical content?

Thanks, I hope my question makes sense as I am still learning databases


Solution

  • If it's safe to assume that you're starting with all relationships doubled up, e.g.

    If A B is in the table, then B A is guaranteed to be in the table.

    Then all you have to do is remove all rows where P2 < P1;

    DELETE FROM `table_name` WHERE `P2` < `P1`;
    

    If this isn't the case, you can make it the case by going through the table and inserting all the duplicate rows if they don't already exist, then running this.