Tags: python, pandas, dataframe, pivot, dummy-variable

Pivot a column containing lists


I have a DataFrame called dfGenres with one column, Genres, where each value is a string that looks like a list of genres.

Code for the DataFrame:

import pandas as pd
dfGenres = pd.DataFrame({'Genres':["[Action, Romance, Mystery, Animal]", "[Bromance, Drama, Mystery]", "[Horror, Monsters, Shounen, Action]"]})

By "pivot" I mean that I would like to keep the same number of rows, but with one column per unique element of the lists, filled with 1 if the row's list contains that genre and 0 otherwise.

I tried something with get_dummies:

pd.get_dummies(
    dfGenres["Genres"].str.split('[\[,\]]',expand=True).stack()
    )

but I get too many rows (I want a single index, not a MultiIndex).


Solution

  • Try str.get_dummies after stripping the surrounding brackets:

    dfGenres["Genres"].str.strip('[]').str.get_dummies(sep=', ')
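
For reference, here is a minimal, self-contained sketch of the one-liner above applied to the question's sample data. str.strip('[]') removes the bracket characters from both ends of each string, and str.get_dummies(sep=', ') then splits on the separator and builds one 0/1 column per distinct genre, keeping one row per original row. The second half is not part of the answer: it is one possible way to collapse the stack-based attempt from the question back to a single index, and the variable names (dummies, stacked, collapsed) are only illustrative.

```python
import pandas as pd

dfGenres = pd.DataFrame({'Genres': [
    "[Action, Romance, Mystery, Animal]",
    "[Bromance, Drama, Mystery]",
    "[Horror, Monsters, Shounen, Action]",
]})

# Strip the leading "[" and trailing "]", then one-hot encode on ", ".
dummies = dfGenres["Genres"].str.strip("[]").str.get_dummies(sep=", ")
print(dummies)

# Assumed alternative (not from the answer): keep the stack-based approach,
# then take the max per original row to collapse the MultiIndex.
stacked = dfGenres["Genres"].str.strip("[]").str.split(", ", expand=True).stack()
collapsed = pd.get_dummies(stacked).groupby(level=0).max().astype(int)
print(collapsed)
```

Both results have three rows and one column per genre. The str.get_dummies version is shorter and avoids the intermediate MultiIndex entirely, which is why it fits the question better.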