All the data I crunch needs to be reported with a comma as the thousands separator. I only need the separator once the data is written to a .csv file; inside my pandas DataFrame I want to keep the values as int or float.
I can output the floats with zero decimal places using:
df.to_csv(r'c:\Awesome\Groovy.csv', float_format='%.0f')
but whenever I try to add a comma to the float format, it doesn't work:
df.to_csv(r'c:\Awesome\Groovy.csv', float_format=':,.0f') ## WRONG
df.to_csv(r'c:\Awesome\Groovy.csv', float_format='{:,.0f}'.format) ## WRONG
My plan for the ints is to first convert them to float in the DataFrame and then format them with to_csv. Any thoughts?
This might be too kludgy for you, but here goes. Adapting this answer https://stackoverflow.com/a/4205875/42346 we can have our commas:
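import re

def thous(x, sep=',', dot='.'):
    # Split the string form into integer and fractional parts
    num, _, frac = str(x).partition(dot)
    # Reverse, add sep after every 3 digits that are followed by another digit, reverse back
    num = re.sub(r'(\d{3})(?=\d)', r'\1'+sep, num[::-1])[::-1]
    if frac:
        num += dot + frac
    return num

# Note: this turns the column into strings, so do it just before writing the CSV
df['my_column'] = df['my_column'].apply(thous)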
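As an aside, if you would rather skip the regex: Python's format-spec mini-language has a `,` option that inserts the thousands separator for you. As far as I can tell, the reason `float_format = ':,.0f'` fails is that a string passed there is used as a printf-style (`%`) format, and `%` formatting has no thousands flag. The helper name `thous_fmt` and its `decimals` parameter are made up for this sketch:

```python
def thous_fmt(x, decimals=0):
    """Format an int or float with comma thousands separators.

    Hypothetical helper: uses the built-in ',' format option instead of a regex.
    """
    # Builds a spec such as ',.0f' or ',.2f' and applies it to x
    return format(x, ',.{}f'.format(decimals))


print(thous_fmt(1234.56789))                 # 1,235
print(thous_fmt(1111111.11111, decimals=2))  # 1,111,111.11
print(thous_fmt(42))                         # 42
```

Unlike the regex version, this rounds to a fixed number of decimals, and it accepts ints directly, so there is no need to convert them to float first.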
Using some sample data I had lying around from another SO question:
>>> df = pd.DataFrame({'date': ['2017-03-10', '2017-03-11', '2017-03-12'],
...                    'activate_time': ['2017-03-10 12:13:30', '2017-03-11 13:57:49', '2017-03-12 14:28:05'],
...                    'mycol': [1234.56789, 9876.54321, 1111111.11111]})
>>> df['mycol'] = df['mycol'].apply(thous)
>>> df
         activate_time        date            mycol
0  2017-03-10 12:13:30  2017-03-10      1,234.56789
1  2017-03-11 13:57:49  2017-03-11      9,876.54321
2  2017-03-12 14:28:05  2017-03-12  1,111,111.11111
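Since you want the DataFrame itself to stay numeric, one option is to format a copy right before writing. This is only a sketch: the function name `to_csv_with_commas` is mine, not a pandas API, and the `count` column is added to the sample data to show an int column. It writes to an in-memory buffer so it runs without touching the disk; pass a file path instead for real use.

```python
import io

import pandas as pd


def to_csv_with_commas(df, path_or_buf, fmt='{:,.0f}'):
    """Write df to CSV with thousands separators, leaving df itself untouched.

    Hypothetical helper for illustration; fmt is a str.format-style template.
    """
    out = df.copy()
    # Only numeric columns are converted; dates and other strings pass through
    for col in out.select_dtypes(include='number').columns:
        out[col] = out[col].map(fmt.format)
    out.to_csv(path_or_buf, index=False)


df = pd.DataFrame({'date': ['2017-03-10', '2017-03-11'],
                   'mycol': [1234.56789, 1111111.11111],
                   'count': [1500, 20]})

buf = io.StringIO()
to_csv_with_commas(df, buf)
print(buf.getvalue())
```

Fields that now contain a comma are wrapped in double quotes by the CSV writer, so the file still parses correctly, and `df['mycol']` is still float64 afterwards.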