I have the following dataframe called data:
metrics artists
0 0.21 ['Zhané']
2 0.14 ['Mose Allison']
3 0.87 ['水柳仙']
4 0.25 ['Shel Silverstein']
Some records in the "artists" column have special characters. I want to make another df containing only the records that have special characters, that is, the following output:
data:
metrics artists
0 0.14 ['Mose Allison']
1 0.25 ['Shel Silverstein']
data2:
metrics artists
0 0.21 ['Zhané']
1 0.87 ['水柳仙']
I use:
data2=data.artists[data.artists.str.contains("[^a-zA-Z0-9]")]
but I get the original df.
I also tried with:
data2 = []
for x in data['artists']:
    if x is not "[^a-zA-Z0-9 ]":
        data2[x]=data[x]
print(data2)
but it gives me the error:
KeyError: "['Zhané']"
and with:
if x is "[^ a-zA-Z0-9]":
it returns empty records.
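You're missing a space in "[^a-zA-Z0-9]", which is why you're getting the original df: without it, the space in a name like 'Mose Allison' counts as a special character, so every row matches. Use "[^a-zA-Z0-9 ]" instead. Tested with Python 3 in a Jupyter notebook.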
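A minimal sketch of the fix, under a few assumptions that are not confirmed by the question: the "artists" values are plain strings that literally contain the brackets and quotes (the KeyError message suggests this), so those characters are added to the allowed set along with the space; the index is simplified to 0..3; and the name `plain` is only an illustrative stand-in for the filtered `data`.

```python
import pandas as pd

# Sample data from the question (index simplified to 0..3).
data = pd.DataFrame({
    "metrics": [0.21, 0.14, 0.87, 0.25],
    "artists": ["['Zhané']", "['Mose Allison']", "['水柳仙']", "['Shel Silverstein']"],
})

# Allowed characters: ASCII letters, digits, space, plus the list punctuation
# ([ ] ' " ,) that is assumed to be part of each string value.
pattern = r"[^a-zA-Z0-9 \[\]',\"]"

# Boolean mask: True where the value has at least one character outside the set.
has_special = data["artists"].str.contains(pattern, regex=True)

# Index the whole DataFrame (not data.artists) so both columns are kept.
data2 = data[has_special].reset_index(drop=True)   # rows with special characters
plain = data[~has_special].reset_index(drop=True)  # rows without them

print(plain)
print(data2)
```

Indexing `data[...]` rather than `data.artists[...]` keeps the "metrics" column in the result. As for the loop in the question, `x is not "[^a-zA-Z0-9 ]"` compares object identity with a literal string and never applies the regex, and `data[x]` looks up a column named after the artist, which is what raises the KeyError.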