
Pandas fillna exception for 'NA' string


sample_file.txt

6|test|3|4
5|test||8
9|test|NA|12

Script

import pandas as pd
df = pd.read_csv('sample_file.txt', dtype='str', sep='|', names=['upc_cd', 'chr_typ', 'chr_vl','chr_vl_typ'])
df["chr_vl"] = df["chr_vl"].fillna("NOT AVLBL")
print(df)

Current output

  upc_cd chr_typ     chr_vl chr_vl_typ
0      6    test          3          4
1      5    test  NOT AVLBL          8
2      9    test  NOT AVLBL         12

Required output

  upc_cd chr_typ     chr_vl chr_vl_typ
0      6    test          3          4
1      5    test  NOT AVLBL          8
2      9    test         NA         12

Basically, I need the NA string to stay as it is in the output, while null (empty) values are replaced with the text 'NOT AVLBL'. I tried the replace method as well, but couldn't get the desired output.


Solution

  • Pandas' read_csv function already defines a set of strings that are interpreted as NaN when you load a CSV file. You can either extend that list with additional strings (na_values) or replace it entirely (keep_default_na=False). In your case you have to replace it, because NA is one of the default strings pandas treats as missing. To do so, you could try something like:

    df = pd.read_csv('sample_file.txt', dtype='str', sep='|',
                     names=['upc_cd', 'chr_typ', 'chr_vl','chr_vl_typ'],
                     na_values=[''], keep_default_na=False)
    ...
    

    With keep_default_na set to False and '' passed as the only entry in na_values, only the empty string is interpreted as missing, so the literal NA is read as an ordinary string. If you want to learn more, have a look at the pandas docs.
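To see the difference end to end, here is a minimal, self-contained sketch. It assumes the three rows from the question and reads them from an in-memory io.StringIO buffer instead of sample_file.txt; the names SAMPLE, COLUMNS and default_df are only illustrative and do not come from the original post.

```python
import io

import pandas as pd

# In-memory stand-in for sample_file.txt (same three rows as in the question).
SAMPLE = "6|test|3|4\n5|test||8\n9|test|NA|12\n"
COLUMNS = ["upc_cd", "chr_typ", "chr_vl", "chr_vl_typ"]

# Default parsing: the empty field and the string "NA" are both read as missing.
default_df = pd.read_csv(io.StringIO(SAMPLE), dtype="str", sep="|", names=COLUMNS)

# Overridden parsing: only the empty string is read as missing.
df = pd.read_csv(
    io.StringIO(SAMPLE),
    dtype="str",
    sep="|",
    names=COLUMNS,
    na_values=[""],
    keep_default_na=False,
)

# Assign the result back instead of calling fillna(..., inplace=True) on the
# selected column, which may not update df in newer pandas versions.
df["chr_vl"] = df["chr_vl"].fillna("NOT AVLBL")

print(default_df["chr_vl"].isna().tolist())
print(df)
```

With the default settings, rows 1 and 2 of chr_vl are both flagged as missing. With the overridden settings, only the empty field in row 1 should be filled with NOT AVLBL, and NA in row 2 should stay untouched, which matches the required output.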