I have been doing a bit of analysis of both Qualtrics and Google Forms surveys with pandas.
Some of the questions are of the format:
What do you like about cake? (select as many as you need to)
Both systems export the answers to such a question as a single column of comma-separated choices, sitting next to the columns for the other questions:
| cake 🍰 | ramen 🍜 |
|---------|----------|
| 1, 3, 4 | love     |
| 1       | hate     |
| 3, 4    | love     |
and so on. Both systems do automatic bar charts of the responses, but they are hard to work with.
In the past I've handled this by breaking the answers into extra columns, or by processing everything on the fly and building a temporary dataframe for a specific graph.
Is there a more elegant method of handling columns like this? In particular, I'd like to do stacked bar charts of cake feelings, broken up by how people feel about ramen (for example).
Most solutions to similar problems require creating a new dataframe. Example: "Pandas column of lists, create a row for each list element".
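For reference, the new-dataframe route can be sketched roughly as below. This is only a minimal illustration: the sample data is copied from the table in the question, the cells are assumed to be comma-separated strings, and the names `long` and `counts` are made up for the example.

```python
import pandas as pd

# Hypothetical sample data, taken from the table in the question.
df = pd.DataFrame({
    "cake": ["1, 3, 4", "1", "3, 4"],
    "ramen": ["love", "hate", "love"],
})

# Split each answer string into a list, then give every choice its own row.
long = (
    df.assign(cake=df["cake"].str.split(","))
      .explode("cake")
      .reset_index(drop=True)
)
# Drop the whitespace left over from the split and convert to integers.
long["cake"] = long["cake"].str.strip().astype(int)

# One row per cake option, one column per ramen feeling.
counts = pd.crosstab(long["cake"], long["ramen"])
print(counts)

# With matplotlib installed, this table plots directly as a stacked bar chart:
# counts.plot(kind="bar", stacked=True)
```

The long table keeps the ramen answer next to every cake choice, which is what makes the cross-tabulation a one-liner.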
If you don't want to do that, just unpack the lists. A helper function is needed because the cells are uneven: some hold a list and others a single bare value:
tolist = lambda a: a if isinstance(a, list) else [a]
[a for b in df['cake'].values for a in tolist(b)]
[1, 3, 4, 2, 3, 4]
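A self-contained version of the same idea might look like the sketch below. The dataframe here is hypothetical (built from the question's table, with one cell left as a bare integer to show the uneven case), so its values differ from the output above, and `to_list`, `flat` and `tally` are just illustrative names.

```python
from collections import Counter

import pandas as pd

# Hypothetical data: two cells hold lists, one holds a bare integer.
df = pd.DataFrame({"cake": [[1, 3, 4], 1, [3, 4]]})


def to_list(value):
    """Wrap a bare value in a list so every cell can be iterated the same way."""
    return value if isinstance(value, list) else [value]


# Flatten the column into one list of choices without building a new dataframe.
flat = [item for cell in df["cake"] for item in to_list(cell)]

# Count how often each option was picked.
tally = Counter(flat)
print(flat)
print(tally)
```

This keeps the original dataframe untouched, but the link between each cake choice and that respondent's ramen answer is lost, so it suits simple totals better than charts broken up by another column.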