I have a pandas dataframe like as below
Country_list
{'INDIA': '98.31%', 'ASEAN': '1.69%'}
{'KOREA': '100.0%'}
{'INDIA': '95.00%', 'ASEAN': '2.50%','ANZ': '2.50%'}
{'INDIA': '95.00%', 'ASEAN': '2.50%','ANZ': '1.25%','KOREA': '1.25%'}
I would like to do the below
a) Replace all numbers and special characters with '' (no blanks)
b) insert new character - Comma between different region names
I tried the below but this doesn't seem efficient or elegant
df['Country_list'] = df['Country_list'].str.replace(r":",'', regex=True).str.replace(r"%", '', regex=True).str.replace(r"{",'', regex=True).str.replace(r"}",'', regex=True)
I expect my output to be like as below
INDIA,ASEAN
KOREA
INDIA,ASEAN,ANZ
INDIA,ASEAN,ANZ,KOREA
If need join keys of dictionaries by ,
convert strings to dictionaries and join keys in list comprehension:
import ast
df['Country_list'] = [','.join(ast.literal_eval(x).keys()) for x in df['Country_list']]
print (df)
Country_list
0 INDIA,ASEAN
1 KOREA
2 INDIA,ASEAN,ANZ
3 INDIA,ASEAN,ANZ,KOREA
Alternative solution:
import ast
f = lambda x: ','.join(ast.literal_eval(x).keys())
df['Country_list'] = df['Country_list'].apply(f)