Search code examples
mysqlsplitconcatenationgroup-concat

MySQL split on semicolon in a column to form a new row for each split


So I've been given legacy mysql db with data that looks like this:

   DATE     |     DATA
2015-04-27  | sample; sample2; sample3
2015-04-28  | sample1; sample4

I'm looking to create a new table that would store date/data pairs like this.

   DATE     |     DATA
2015-04-27  | sample
2015-04-27  | sample2
2015-04-27  | sample3
2015-04-28  | sample1
2015-04-28  | sample4

I've tried looking into user defined split functions, but they all seem to just want to split on a char and return an instance of the splits. I need to keep all instances to put in new table.

So using this: http://blog.fedecarg.com/2009/02/22/mysql-split-string-function/

I created the split function and have tried altering the return:

REPLACE(SUBSTRING(SUBSTRING_INDEX(list, ';'),
LENGTH(SUBSTRING_INDEX(list, ';', pos -1)) + 1),
delim, '')

But it requires a third parameter to specify which instance you want to return. Is there a way to return all instances of new rows with the date from left column?


Solution

  • Here's one idea...

    DROP TABLE IF EXISTS my_table;
    
    CREATE TABLE my_table 
    ( date DATE NOT NULL
    , data VARCHAR(100) NOT NULL
    );
    
    INSERT INTO my_table VALUES
    ('2015-04-27','sample; sample2; sample3'),
    ('2015-04-28','sample1; sample4');
    
    SELECT * FROM my_table;;
    +------------+--------------------------+
    | date       | data                     |
    +------------+--------------------------+
    | 2015-04-27 | sample; sample2; sample3 |
    | 2015-04-28 | sample1; sample4         |
    +------------+--------------------------+
    
    SELECT * FROM ints;
    +---+
    | i |
    +---+
    | 0 |
    | 1 |
    | 2 |
    | 3 |
    | 4 |
    | 5 |
    | 6 |
    | 7 |
    | 8 |
    | 9 |
    +---+
    
    SELECT DISTINCT date
                  , data
                  , TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(data,';',i+1),';',-1)) x 
               FROM ints
                  , my_table
              ORDER 
                 BY date,i;
    
    +------------+--------------------------+---------+
    | date       | data                     | x       |
    +------------+--------------------------+---------+
    | 2015-04-27 | sample; sample2; sample3 | sample  |
    | 2015-04-27 | sample; sample2; sample3 | sample2 |
    | 2015-04-27 | sample; sample2; sample3 | sample3 |
    | 2015-04-28 | sample1; sample4         | sample1 |
    | 2015-04-28 | sample1; sample4         | sample4 |
    +------------+--------------------------+---------+