Data: user_choices
table
id | userid | choices |
---|---|---|
1 | 3 | 1,2,3 |
2 | 55 | 5 |
3 | 783 | 2,6,7 |
4 | 45 | 1,4 |
5 | 5 | 1,2,3,4,5,6,7 |
How can I query this table to explode the choices
column filtered only to show values 1 or 2 or 3 or 4 or 7 and group it by userid
as follows?
userid | choice |
---|---|
3 | 1 |
3 | 2 |
3 | 3 |
783 | 2 |
783 | 7 |
45 | 1 |
45 | 4 |
5 | 1 |
5 | 2 |
5 | 3 |
5 | 4 |
5 | 7 |
I have tried the following, but the results include 5
and 6
, which should be omitted.
select replace(substring(substring_index(uc.choices, ',', 1), CHAR_LENGTH(SUBSTRING_INDEX(uc.choices, ',', -1)), + 1), ',', '') choice,,
uc.userid
from user_choices uc
join (select 1 digit union all select 2 union all select 3 union all select 4 union all select 7) n on length(replace(uc.choices, ',', '')) < length(uc.choices) - n.digit
where choices regexp '[12347]'
order by uc.userid
Note: I know the regexp isn't perfect here (matches on 22
, for example). The table here won't ever receive this, so I don't think stricter regex is required.
OUTPUT:
choice | userid |
---|---|
1 | 3 |
1 | 5 |
1 | 5 |
1 | 5 |
1 | 5 |
2 | 783 |
This attempt is based on this answer to another question, but I can't quite get it working for my scenario.
Your help is much appreciated!
An alternative to the numbers table technique is to use JSON.
The trick is that, given the format of your CSV strings (numbers separated by commas), we can easily turn them to valid JSON values by just surrounding them with square brackets; eg [1,2,3]
is a valid JSON array. Then, we can use handy JSON function json_table()
to unnest the arrays to rows. The last step is filtering.
So:
select t.userid, j.choice
from mytable t
cross join json_table( concat('[', t.choices, ']'), '$[*]' columns (choice int path '$') ) j
where j.choice in (1, 2, 3, 4, 7)
order by t.userid, j.choice
userid | choice |
---|---|
3 | 1 |
3 | 2 |
3 | 3 |
5 | 1 |
5 | 2 |
5 | 3 |
5 | 4 |
5 | 7 |
45 | 1 |
45 | 4 |
783 | 2 |
783 | 7 |