I first tried solving my problem in Excel with no easy fix, so decided to give it a shot in SQL (on PostgreSQL/pgAdminIII), where I am a beginner, and where I didn't find a satisfying solution.
My goal is to "flatten" a data set containing similar attributes in a single row, which should have a row of their own.
An example might clarify. My data lists shopping bags and their contents as follows:
id material color fruit1 fruit2 fruit3
1 cotton red apple banana cherry
2 paper blue apple cherry
3 plastic red banana
I need to create a table with a new line for each fruit, so the result of the query should be something like:
id material color fruit
1 cotton red apple
1 cotton red banana
1 cotton red cherry
2 paper blue apple
2 paper blue cherry
3 plastic red banana
So far, I came up with a query involving CASE, but this only returns the first match, and so doesn't return all the needed lines.
SELECT
id,
(CASE
WHEN 'apple' IN(fruit1, fruit2, fruit3) THEN 'apple_exh'
WHEN 'banana' IN(fruit1, fruit2, fruit3) THEN 'banana_exh'
WHEN 'cherry' IN(fruit1, fruit2, fruit3) THEN 'cherry_exh'
ELSE 'Error'
END) as "Fruit_Here"
FROM
mydb.shopping
WHERE
'apple' IN(fruit1, fruit2, fruit3)
OR
'banana' IN(fruit1, fruit2, fruit3)
OR
'cherry' IN(fruit1, fruit2, fruit3)
ORDER BY id;
Returns
id; fruit_here
1;"apple_exh"
2;"apple_exh"
3;"banana_exh"
It would be really nice if a trick existed to allow CASE to return all matches, and not just the first. My current workaround using a sequence of CASE and UNION ALL (see below example for apples and bananas) works, but is unrealistically tedious, as my full data includes around 30 fruits (and possibly I should apply the same "flattening" to vegetables, also initially on a single row).
SELECT
id,
(CASE
WHEN 'apple' IN(fruit1, fruit2, fruit3) THEN 'apple_exh'
ELSE 'Error'
END) as "Fruit_Here"
FROM
mydb.shopping
WHERE
'apple' IN(fruit1, fruit2, fruit3)
UNION ALL
SELECT
id,
(CASE
WHEN 'banana' IN(fruit1, fruit2, fruit3) THEN 'banana_exh'
ELSE 'Error'
END) as "Fruit_Here"
FROM
mydb.shopping
WHERE
'banana' IN(fruit1, fruit2, fruit3)
UNION ALL
SELECT
id,
(CASE
WHEN 'cherry' IN(fruit1, fruit2, fruit3) THEN 'cherry_exh'
ELSE 'Error'
END) as "Fruit_Here"
FROM
mydb.shopping
WHERE
'cherry' IN(fruit1, fruit2, fruit3)
ORDER BY id, "Fruit_Here";
Returns
id; fruit_here
1;"apple_exh"
1;"banana_exh"
1;"cherry_exh"
2;"apple_exh"
2;"cherry_exh"
3;"banana_exh"
My question: Is there any other obvious way to perform this task in SQL, without having to duplicate code for each type of fruit?
You only need a select statement for each column:
select id, material, color, fruit1 from mydb.shopping where fruit1 is not null
union
select id, material, color, fruit2 from mydb.shopping where fruit2 is not null
union
select id, material, color, fruit3 from mydb.shopping where fruit3 is not null