Tags: python, pandas, fillna

How can I make broadly applicable code that fills missing elements differently according to the variable type?


I need to fill in the missing values in a large number of CSV files. The files usually share almost the same variables.

Here are the conditions I need to satisfy:

  1. If a variable is numeric, fill its missing values with -1.
  2. If a variable is character (text), fill its missing values with 'm'.

The problem is that each CSV file has a slightly different set of variables. For example, Data_1 is

v1 v2 v3 v4 v5
1  a  d  1   
2  b     1   4
   d  a  1   6
2     d  1

then it should be

v1 v2 v3 v4 v5
1  a  d  1  -1
2  b  m  1   4
-1 d  a  1   6
2  m  d  1  -1

However, the other files contain different columns, for example

v1 v2 v3 v5
1  a  d   
2  b     4
   d  a  6
2     d  

or

v5 v6
    x
 4  y
 6  
    d

Therefore, I want to write code that can be applied uniformly to many CSV files like the ones above. I tried fillna, for example:

x = x.fillna(-1)
y = y.fillna('m')

Solution

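  • From your description, a function like this might help. It picks the fill value from the column's dtype rather than from its first element, because the first element may itself be missing:

    import pandas as pd
    def filldefault(series):
      # Numeric columns get -1; everything else (e.g. text) gets 'm'.
      return series.fillna(-1 if pd.api.types.is_numeric_dtype(series) else 'm')

    You can then apply it to every column of your DataFrame, for example with df = df.apply(filldefault).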
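As a rough sketch of how this could look for a whole DataFrame, the snippet below builds one fill value per column from its dtype, so it does not matter which columns a given file contains. The function name fill_by_dtype and the inline CSV text (standing in for Data_1 from the question) are illustrative assumptions, and it presumes the files parse cleanly with pd.read_csv.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype


def fill_by_dtype(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with numeric columns filled with -1 and all others with 'm'."""
    # One fill value per column, chosen from the column's dtype.
    # (fill_by_dtype is a hypothetical helper name, not from the original post.)
    fill_values = {
        col: (-1 if is_numeric_dtype(df[col]) else "m") for col in df.columns
    }
    return df.fillna(fill_values)


# Data_1 from the question, written as CSV text (a stand-in for a real file).
data_1 = io.StringIO(
    "v1,v2,v3,v4,v5\n"
    "1,a,d,1,\n"
    "2,b,,1,4\n"
    ",d,a,1,6\n"
    "2,,d,1,\n"
)
filled = fill_by_dtype(pd.read_csv(data_1))
print(filled)
```

Numeric columns that contained missing values are read as floats, so their filled values appear as -1.0 rather than -1. A column that is meant to be numeric but contains stray text will be read as text and filled with 'm', so it may be worth checking df.dtypes on a few files first. The same function can be called on each file's DataFrame in a loop.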