Say I have a dataframe which looks as follows. The values within the column Value
are decimals, some of them written with a comma as the decimal separator.
df.head()
       ID   Key      Value
0  A0AVT1  MAHA    4842000
1  A0FGR8  MAHA    3522710
2  A0JLT2  MAHA    283,433
3  A0JNW5  MAHA  356,09677
4  A0MZ66   CEB       37,5
5  A0PJW6   CEB  487,03677
6    A1AG   CEB  10,625567
7  A1L0T0   HAC         12
8  A1L390   HAC     63,946
9  A1X283   HAC     138,25
I want to use the pandas pivot_table to cast the above dataframe, using ID as the index and Key as the columns, with values from the column Value. So I tried the following one-liner:
df2.reset_index().pivot_table(values='Value',index='ID',columns='Key')
However, the above one liner is throwing this data error:
~/software/anaconda/lib/python3.7/site-packages/pandas/core/groupby/groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
4042
4043 if len(new_blocks) == 0:
-> 4044 raise DataError('No numeric types to aggregate')
4045
4046 # reset the locs in the blocks to correspond to our
DataError: No numeric types to aggregate
Further, I have tried to use the module locale to convert the , in the Value column of my dataframe df. Here is what I have tried:
import locale
locale.setlocale(locale.LC_ALL, 'de_DE') #Germany
df.Value.astype(str).apply(locale.atof)
And it is throwing the error:
TypeError: data type not understood
I have also tried using astype(float). It did not change anything.
Any help/suggestions are much appreciated! Thank you.
The universal way to set the locale correctly is to let the system determine it from the environment:
locale.setlocale(locale.LC_NUMERIC, '')
This yields on my machine:
>>> locale.setlocale(locale.LC_NUMERIC, '')
'de_DE.UTF-8'
>>> df.Value.apply(locale.atof)
0 4.842000e+06
1 3.522710e+06
2 2.834330e+02
3 3.560968e+02
4 3.750000e+01
5 4.870368e+02
6 1.062557e+01
7 1.200000e+01
8 6.394600e+01
9 1.382500e+02
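Note that apply only returns a new Series; the result still has to be written back to the column before pivot_table sees numbers. A minimal sketch of the full round trip follows. The helper name pivot_localized and its convert parameter are my own, made up for illustration, and the sketch assumes Value holds strings as in the question:

```python
import locale

import pandas as pd


def pivot_localized(df, convert=locale.atof):
    """Convert the string column 'Value' to floats, then pivot ID x Key.

    `convert` maps one string to a float. The default, locale.atof, only
    understands '37,5' once a matching LC_NUMERIC locale has been set.
    """
    out = df.copy()  # leave the caller's dataframe untouched
    # Write the converted values back; apply/map alone does not modify df.
    out["Value"] = out["Value"].astype(str).map(convert)
    # pivot_table aggregates with the mean by default; in the sample shown
    # each (ID, Key) pair occurs once, so each value passes through as-is.
    return out.pivot_table(values="Value", index="ID", columns="Key")
```

After a successful locale.setlocale(locale.LC_NUMERIC, '') on a machine with a German locale, pivot_localized(df) should give the ID by Key table the question asks for. Passing a different convert function makes the same helper usable where that locale is not installed.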
If you want to set the locale explicitly, you'll have to use different locale strings for Linux and Windows:
Linux:
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF8') # or 'de_DE.UTF-8'
Windows:
locale.setlocale(locale.LC_NUMERIC, 'German') # or 'de' or 'deu' (case insensitive)
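If you would rather not depend on which locales are installed, a locale-free alternative is to swap the comma for a dot and let pandas parse the result. This is only a sketch, and it assumes the comma is always a decimal separator and never a thousands separator, which holds for the sample in the question but may not hold for the full data:

```python
import pandas as pd

# Sample data from the question; Value is stored as strings.
df = pd.DataFrame({
    "ID": ["A0AVT1", "A0FGR8", "A0JLT2", "A0JNW5", "A0MZ66",
           "A0PJW6", "A1AG", "A1L0T0", "A1L390", "A1X283"],
    "Key": ["MAHA", "MAHA", "MAHA", "MAHA", "CEB",
            "CEB", "CEB", "HAC", "HAC", "HAC"],
    "Value": ["4842000", "3522710", "283,433", "356,09677", "37,5",
              "487,03677", "10,625567", "12", "63,946", "138,25"],
})

# Assumption: ',' is only ever the decimal separator in this column.
# regex=False makes the replacement a plain string substitution.
df["Value"] = pd.to_numeric(df["Value"].str.replace(",", ".", regex=False))

# With a numeric Value column, pivot_table has something to aggregate.
wide = df.pivot_table(values="Value", index="ID", columns="Key")
print(wide)
```

If the data comes from a CSV file (the question does not say), passing decimal=',' to pd.read_csv should parse the column as floats at load time, so no conversion step is needed afterwards.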