After constructing data frame df
containing np.nan
, None
and empty string (''
) values in Python 3.8.3 and Pandas 1.0.4
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'b':None,'c':''},index=[0])
df2 = pd.DataFrame({'a':1,'b':1,'c':None},index=[0])
df = pd.concat([df1,df2], axis=0, sort=True)
print(df)
data frame df
looks like
a b c
0 NaN None
0 1.0 1 None
Now I would like to store the values to Excel using to_excel()
function. However, after running command
df.to_excel('nan_none_empty.xlsx')
the result rather looks like
with np.nan
, None
and empty string (''
) not being separable from each other.
It is possible to separate empty string (''
) from np.nan
and None
with option na_rep
as follows
df.to_excel('nan_none_empty2.xlsx',na_rep='?')
giving result
But it seems like, for this problem, I am running out of options for to_excel(...) function in order to be able to separate np.nan
from None
in the Excel export.
How could one neatly separate between np.nan
and None
when exporting df
to Excel?
If you'd like to distinguish between different null types, your best bet is to replace values before exporting to Excel. Converting to a string is one way to make sure you are not conflating None, np.NaN, pd.NaT, etc...
df1 = pd.DataFrame({'b':None,'c':''},index=[0])
df2 = pd.DataFrame({'a':1,'b':1,'c':None},index=[0])
df = pd.concat([df1,df2], axis=0, sort=True)
null_map = {'nan': '-',
'None': '?'} # Add the string representation for other types you may need
df = df.applymap(lambda x:
null_map[str(x)]
if str(x) in null_map
else x)
df.to_excel('nan_none_empty2.xlsx')