Search code examples
mysqlpivotgroup-concat

How to use GROUP_CONCAT to pivot and generate concatenated column names dynamically


I have a table with antiviral activities for compounds in different cell lines infected with different viruses, like this:

ID  Batch_ID    Cell_Line   Virus   Conc    Effect
1   abc123      U87         ZIKV    5       67
2   abc123      SW13        HAZV    10      35
3   def456      U87         ZIKV    5       85
4   def456      SW13        HAZV    10      15

I would like to convert this to a View that looks like this:

ID  Batch_ID    U87_ZIKV_5  SW13_HAZV_10
1   abc123      67          35
2   def456      85          15

So basically a pivot that generates new columns and corresponding names by dynamically concatenating the original row values for Cell_Line, Virus and Conc, and then report the Effect for each compound. For a limited number of combinations this could of course be 'hard-coded' but I want to allow for the possibility of additional cell lines, virus strains and concentrations to be added later on, and hence for the pivoted column names to be generated dynamically.

Grateful for any pointers on how to code this efficiently in mySQL!


Solution

  • So, a hard-coded solution for your table is pretty simple:

    select Batch_ID,
      SUM(CASE WHEN CONCAT(Cell_Line,'_',Virus,'_',CONVERT(Conc,CHAR))= 'U87_ZIKV_5' THEN Effect END) U87_ZIKV_5,
      SUM(CASE WHEN CONCAT(Cell_Line,'_',Virus,'_',CONVERT(Conc,CHAR)) = 'SW13_HAZV_10' THEN Effect END) SW13_HAZV_10
    from viruses
    group by Batch_ID;
    

    But if you want to make it flexible, you should use the thing called Dynamic pivot tables. I used the approach discussed here and came up with this MySQL code that should solve your problem:

    SET @sql = NULL;
    SELECT
      GROUP_CONCAT(DISTINCT
        CONCAT(
          'SUM(CASE WHEN CONCAT(Cell_Line,\'_\',Virus,\'_\',CONVERT(Conc,CHAR)) =  ''',
          CONCAT(Cell_Line,'_',Virus,'_',CONVERT(Conc,CHAR)),
          ''' THEN Effect END) AS ',
          CONCAT(Cell_Line,'_',Virus,'_',CONVERT(Conc,CHAR))
        )
      ) INTO @sql
    from viruses;
    
    SET @sql = CONCAT('SELECT Batch_ID, ', @sql, ' from viruses
    group by Batch_ID');
    
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;