I'm trying to decode my dataframe through the following code:
    df = pd.read_sql_table('mytable', con)
    for column in df.columns:
        for i in range(len(df[column])):
            if type(df[column][i]) == bytearray or type(df[column][i]) == bytes:
                df[column][i] = str(df[column][i], 'utf-8')
but I keep getting a SettingWithCopyWarning no matter what I try.
Does anyone know how to deal with this warning?
I've ended up settling for this:
    for column in df.columns:
        if df[column].dtype == 'object':
            df[column] = df[column].apply(lambda x: x.decode('utf-8') if isinstance(x, bytes) else x)
Thanks for the help!
A few ways to improve this:
- Use the pd.Series.astype() method, which is more efficient than str() as it is vectorized (i.e. you can call it on the whole Series).
- Use .loc to avoid the SettingWithCopy warning.
So your code will look like:
    for column in df.columns:
        # Rows come first in .loc, so select all rows and then the column label
        df.loc[:, column] = df[column].astype(str)
Note that the str type will be encoded as utf-8 in all but very old versions of Python. However, if you are using 2.x you can do df[column].astype('unicode').
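For reference, here is a minimal self-contained sketch of the whole-column approach discussed in this thread. It decodes explicitly with .decode() rather than relying on astype(str), since how astype(str) renders bytes values may depend on the pandas version. The helper name decode_bytes_columns and the sample data are made up for illustration; they are not from the original post.

```python
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with bytes/bytearray values decoded to str.

    Hypothetical helper: assigns whole columns at once instead of writing
    through chained indexing like df[column][i] = ...
    """
    out = df.copy()  # work on an explicit copy so the input is left untouched
    for column in out.columns:
        # Only object columns can hold bytes values
        if out[column].dtype == object:
            out[column] = out[column].apply(
                lambda x: x.decode(encoding)
                if isinstance(x, (bytes, bytearray))
                else x
            )
    return out


# Made-up sample data standing in for the result of pd.read_sql_table
df = pd.DataFrame(
    {"name": [b"alice", bytearray(b"bob"), "carol"], "n": [1, 2, 3]}
)
decoded = decode_bytes_columns(df)
```

Because each column is replaced in a single assignment on a DataFrame that is explicitly a copy, there is no element-by-element chained assignment, which is the pattern the original loop used when the warning appeared.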