sample_file.txt
6|test|3|4
5|test||8
9|test|NA|12
Script
import pandas as pd
df = pd.read_csv('sample_file.txt', dtype='str', sep='|', names=['upc_cd', 'chr_typ', 'chr_vl','chr_vl_typ'])
df["chr_vl"].fillna("NOT AVLBL", inplace = True)
print(df)
Current output
upc_cd chr_typ chr_vl chr_vl_typ
0 6 test 3 4
1 5 test NOT AVLBL 8
2 9 test NOT AVLBL 12
Required output
upc_cd chr_typ chr_vl chr_vl_typ
0 6 test 3 4
1 5 test NOT AVLBL 8
2 9 test NA 12
Basically, I need NA to stay as it is in the output, while null values are replaced with the specific text 'NOT AVLBL'. I tried the replace method as well, but couldn't get the desired output.
The pandas read_csv function already defines a set of strings that are interpreted as NaN when you load a CSV file. You can either extend that list with other strings or overwrite it completely. In your case you have to overwrite it, because NA is one of the default values pandas treats as missing. To do so, you could try something like
df = pd.read_csv('sample_file.txt', dtype='str', sep='|',
                 names=['upc_cd', 'chr_typ', 'chr_vl', 'chr_vl_typ'],
                 na_values=[''], keep_default_na=False)
...
This interprets only the empty string as NA, because keep_default_na is set to False and '' is the only value passed through the na_values argument.
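For reference, here is a minimal end-to-end sketch of the approach described above, combined with the fillna step from the question. It assumes the three sample rows from the question and reads them from an in-memory string (via io.StringIO) instead of sample_file.txt, purely so that it runs on its own; the variable name data is only illustrative.

```python
import io

import pandas as pd

# Inline stand-in for sample_file.txt (assumption: same three rows as in the question).
data = "6|test|3|4\n5|test||8\n9|test|NA|12\n"

df = pd.read_csv(
    io.StringIO(data),
    dtype=str,
    sep="|",
    names=["upc_cd", "chr_typ", "chr_vl", "chr_vl_typ"],
    na_values=[""],         # treat only the empty string as missing
    keep_default_na=False,  # do not apply pandas' built-in markers such as "NA"
)

# Assign the result back instead of calling fillna(..., inplace=True)
# on a column selection, which recent pandas versions warn about.
df["chr_vl"] = df["chr_vl"].fillna("NOT AVLBL")

print(df)
```

With these settings the literal NA in the third row should come through as an ordinary string, and only the empty field in the second row is filled with 'NOT AVLBL'.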
If you want to learn more, have a look at the pandas docs.