I need help with data cleaning.
How do I make the language column(D) into form in column(F)
Basically just get rid of the brackets and apostrophe, and save the comma between each language.
It can be done either using python or excel itself
Thanks!
I tried to google but it didn't work
You could use pandas to read the csv into a dataframe then "apply" a function to the column that did something like this:
def clean(value: str) -> str:
for c in "[']":
value = value.replace(c, "")
return value
It's worth noting if you remove the brackets you'll have an "embedded" delimiter meaning you wont be able to save this as a "csv" without a few headaches.
If you want to clean it up in Excel you could, similarly to my formula above, add a formula that replaces all of the unwanted characters with an empty string with something like this:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(D2,"'",""),"[",""),"]","")
*where D2 is the first language cell
FWIW, I'd take a few minutes to play with this in pandas too -- it's always good to pick up a new skill