I have a pandas DataFrame (a list of video games) with a column "classification". In that column, we can find:
As you may have noticed, there is no separator...
Of course, I need to split this into a new column, WITH a separator (or some other structure holding each element separately).
So
"Action Adventure RPG Roguelike" => "Action, Adventure, RPG, Roguelike"
"Action Shoot'em Up Wargame" => "Action, Shoot'em Up, Wargame"
I can't split on spaces, nor on capital letters ("Shoot'em Up" is ONE value).
So, as I see it, I need to write a function to apply to this column that checks the text against a hand-made list of values, finds all occurrences, and returns the string with separators...
Something like this:
classification = ["Action", "Adventure", "RPG", "Roguelike", "Shoot'em Up", "Wargame"...]
def magic_tric(data):
    # do the magic, comparing each classification possible / data
    return data_separated
But I do not know how to do it in the most efficient way...
Can someone help me...? Thanks in advance.
Here's an idea: use str.findall.
                                0
0  Action Adventure RPG Roguelike
1      Action Shoot'em Up Wargame
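import re
import pandas as pd
sep = ["Action", "Adventure", "RPG", "Roguelike", "Shoot'em Up", "Wargame"]
# Escape each label and try longer labels first, so a label that is a prefix of another cannot shadow it.
pattern = '|'.join(map(re.escape, sorted(sep, key=len, reverse=True)))
pd.DataFrame(df[0].str.findall(pattern).tolist())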
        0            1        2          3
0  Action    Adventure      RPG  Roguelike
1  Action  Shoot'em Up  Wargame       None
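If you want the comma-separated string from the question rather than one column per label, the same findall result can be joined back together. This is only a minimal sketch: it assumes the column is called "classification" as in the question, and the output column names classification_list and classification_sep are made up for the example.

```python
import re
import pandas as pd

# Hand-made list of known labels (taken from the question; extend as needed).
labels = ["Action", "Adventure", "RPG", "Roguelike", "Shoot'em Up", "Wargame"]

# Longest labels first, each one escaped, so that a label which is a prefix
# of another one cannot win the alternation.
pattern = "|".join(
    re.escape(label) for label in sorted(labels, key=len, reverse=True)
)

# Sample data copied from the question.
df = pd.DataFrame({"classification": [
    "Action Adventure RPG Roguelike",
    "Action Shoot'em Up Wargame",
]})

# Hypothetical column names: one list of labels per row, then the same
# labels joined into a single comma-separated string.
df["classification_list"] = df["classification"].str.findall(pattern)
df["classification_sep"] = df["classification_list"].str.join(", ")

print(df["classification_sep"].tolist())
# ['Action, Adventure, RPG, Roguelike', "Action, Shoot'em Up, Wargame"]
```

Keeping the list column around is usually handier than the joined string if you later want to filter or explode by label. Note that any text not covered by the hand-made list is silently dropped by findall, so it may be worth checking for rows where the joined labels are shorter than expected.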