Search code examples
sqlpostgresqlcasepgadminflatten

Simple SQL query to flatten multiple attributes in a row


I first tried solving my problem in Excel with no easy fix, so decided to give it a shot in SQL (on PostgreSQL/pgAdminIII), where I am a beginner, and where I didn't find a satisfying solution.

My goal is to "flatten" a data set containing similar attributes in a single row, which should have a row of their own.

An example might clarify. My data lists shopping bags and their contents as follows:

id material color fruit1 fruit2 fruit3 
1  cotton   red   apple  banana cherry
2  paper    blue  apple  cherry
3  plastic  red   banana

I need to create a table with a new line for each fruit, so the result of the query should be something like:

id material color fruit  
1  cotton   red   apple  
1  cotton   red   banana 
1  cotton   red   cherry
2  paper    blue  apple
2  paper    blue  cherry
3  plastic  red   banana

So far, I came up with a query involving CASE, but this only returns the first match, and so doesn't return all the needed lines.

SELECT  
  id,
    (CASE 
        WHEN 'apple' IN(fruit1, fruit2, fruit3) THEN 'apple_exh'        
        WHEN 'banana' IN(fruit1, fruit2, fruit3) THEN 'banana_exh'
        WHEN 'cherry' IN(fruit1, fruit2, fruit3) THEN 'cherry_exh'              
        ELSE 'Error'
    END) as "Fruit_Here"
FROM 
  mydb.shopping
WHERE 
 'apple' IN(fruit1, fruit2, fruit3)
 OR
 'banana' IN(fruit1, fruit2, fruit3)
 OR
 'cherry' IN(fruit1, fruit2, fruit3)

ORDER BY id;

Returns

id; fruit_here
1;"apple_exh"
2;"apple_exh"
3;"banana_exh"

It would be really nice if a trick existed to allow CASE to return all matches, and not just the first. My current workaround using a sequence of CASE and UNION ALL (see below example for apples and bananas) works, but is unrealistically tedious, as my full data includes around 30 fruits (and possibly I should apply the same "flattening" to vegetables, also initially on a single row).

SELECT  
  id,
    (CASE 
        WHEN 'apple' IN(fruit1, fruit2, fruit3) THEN 'apple_exh'                        
        ELSE 'Error'
    END) as "Fruit_Here"
FROM 
  mydb.shopping
WHERE 
 'apple' IN(fruit1, fruit2, fruit3)

UNION ALL

SELECT  
  id,
    (CASE 
        WHEN 'banana' IN(fruit1, fruit2, fruit3) THEN 'banana_exh'
        ELSE 'Error'
    END) as "Fruit_Here"
FROM 
  mydb.shopping
WHERE 
 'banana' IN(fruit1, fruit2, fruit3)

 UNION ALL

 SELECT  
  id,
    (CASE 
        WHEN 'cherry' IN(fruit1, fruit2, fruit3) THEN 'cherry_exh'
        ELSE 'Error'
    END) as "Fruit_Here"
FROM 
  mydb.shopping
WHERE 
 'cherry' IN(fruit1, fruit2, fruit3)

ORDER BY id, "Fruit_Here";

Returns

id; fruit_here
1;"apple_exh"
1;"banana_exh"
1;"cherry_exh"
2;"apple_exh"
2;"cherry_exh"
3;"banana_exh"

My question: Is there any other obvious way to perform this task in SQL, without having to duplicate code for each type of fruit?


Solution

  • You only need a select statement for each column:

    select id, material, color, fruit1 from mydb.shopping where fruit1 is not null
    union 
    select id, material, color, fruit2 from mydb.shopping where fruit2 is not null
    union 
    select id, material, color, fruit3 from mydb.shopping where fruit3 is not null