python-3.x dataframe scikit-learn nan imputation

Python: Dealing with NaN Values using Imputer on Dataframe index wise


I have data with some NaN values, and I want to fill them using Imputer.

from sklearn.preprocessing import Imputer 
imp = Imputer(missing_values='NaN', strategy='mean', axis=1) 
cleaned_data = imp.fit_transform(original_data)

As far as I know, Imputer works on an entire column, like this:

            Point1        Point2
S.No
             2              NaN
1            NaN            4
             2              NaN
             NaN            4
2            2              NaN
             NaN            4

After applying Imputer, the data looks like:

            Point1        Point2
S.No
             2              2
1            1              4
             2              2
             1              4
2            2              2
             1              4

But I want Imputer to work index-wise, that is, separately within each group of the index named S.No:

            Point1        Point2
S.No
             2              1.333
1            1.333          4
             2              1.333
             0.667          4
2            2              2.667
             0.667          4

Is it possible to use Imputer like this, or is there an alternative method to do this in Python on a DataFrame?


Solution

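    import numpy as np
    from sklearn.impute import SimpleImputer  # replaces Imputer, which was removed in scikit-learn 0.22
    imp = SimpleImputer(missing_values=np.nan, strategy='mean')
    cols = Data.select_dtypes(include=['float']).columns
    # Fit a separate column-wise mean imputer on the rows of each S.No group.
    for s_no in Data.index.unique():
        rows = Data.index == s_no
        Data.loc[rows, cols] = imp.fit_transform(Data.loc[rows, cols])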
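
As an alternative that does not need scikit-learn, the per-group fill can probably be done with pandas alone using groupby and transform. This is only a minimal sketch: the frame name df is made up, and it assumes that the S.No labels 1 and 2 each cover three consecutive rows, which is how the tables in the question read. Note that a conventional mean skips NaN, so it fills with 2 and 4 here; the numbers in the question's expected output (1.333, 0.667, 2.667) only come out if NaN is counted as 0 when averaging, so both variants are shown.

```python
import numpy as np
import pandas as pd

# Sample data from the question. Assumption: S.No 1 and 2 cover three rows each.
df = pd.DataFrame(
    {
        "Point1": [2, np.nan, 2, np.nan, 2, np.nan],
        "Point2": [np.nan, 4, np.nan, 4, np.nan, 4],
    },
    index=pd.Index([1, 1, 1, 2, 2, 2], name="S.No"),
)

# Variant A: fill each NaN with the mean of the observed (non-NaN) values
# of the same column within the same S.No group.
group_mean = df.groupby(level="S.No").transform("mean")
filled_mean = df.fillna(group_mean)

# Variant B: count NaN as 0 when averaging (group sum / group size).
# This is the variant that matches the expected table in the question.
zero_mean = df.fillna(0).groupby(level="S.No").transform("mean")
filled_like_question = df.fillna(zero_mean)

print(filled_like_question.round(3))
```

transform returns a frame with the same shape and index as df, so fillna can line the group means up with the original rows directly.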