machine-learning data-cleaning feature-engineering

How to Remove the Minimum Value from a DataFrame


I'm a novice in data science and have a question. I have attached the image below.

Image of Described one

In the DataFrame above, the columns Humidity, Wind Speed (km/h), Visibility (km), and Pressure (millibars) have minimum values of zero. I don't think they should be zero. If I'm right, how can I replace those minimum values with the next-smallest value in each column using pandas? Otherwise, please correct me.


Solution

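  • If I understand your question correctly, the following will do the trick. First define a helper, applymin(X), that replaces every occurrence of a column's minimum with its second-smallest value:

    def applymin(X):
        # Distinct non-missing values in ascending order
        uniq = sorted(X.dropna().unique().tolist())
        if len(uniq) < 2:
            return X  # no second-smallest value to substitute
        # mask returns a new Series, so the input column is not modified in place
        return X.mask(X == uniq[0], uniq[1])

    Then apply it to the affected columns:

    cols = ['Humidity', 'Wind Speed (km/h)', 'Visibility (km)', 'Pressure (millibars)']
    df_train[cols] = df_train[cols].apply(applymin)

    i.e. wherever a column takes its minimum value, we replace it with the second-smallest value in that column.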
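
To see the effect of this replacement end to end, here is a minimal, self-contained sketch on a small made-up frame. The sample values and the helper name `replace_min_with_second_smallest` are assumptions for illustration only; they do not come from the question's dataset.

```python
import pandas as pd


def replace_min_with_second_smallest(col: pd.Series) -> pd.Series:
    """Return a copy of col with its minimum replaced by the second-smallest distinct value.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Distinct non-missing values, sorted ascending
    distinct = sorted(col.dropna().unique().tolist())
    if len(distinct) < 2:
        # Nothing to substitute when the column has fewer than two distinct values
        return col.copy()
    # Series.mask replaces entries where the condition is True
    return col.mask(col == distinct[0], distinct[1])


# Made-up sample data (not the question's data)
df_train = pd.DataFrame({
    "Humidity": [0.0, 0.89, 0.86, 0.0, 0.83],
    "Pressure (millibars)": [1015.13, 0.0, 1015.94, 1016.41, 0.0],
})

cols = ["Humidity", "Pressure (millibars)"]
df_train[cols] = df_train[cols].apply(replace_min_with_second_smallest)
print(df_train)
```

Note that this replaces the minimum whatever its value is, so it is only appropriate when the zeros really are invalid readings. A wind speed of 0 km/h, for instance, may well be a genuine measurement, so it is worth checking each column before applying the replacement.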