python pandas dataframe imputation

How can all values of certain included or excluded columns of a DataFrame be imputed based on a condition?


Let's say I have a simple DataFrame:

import pandas as pd

df = pd.DataFrame.from_dict(
        {
            'foo': [0.00, 0.31, 0.45],
            'bar': [1.00, 0.55, 3.01],
            'qux': [0.30, 4.10, 2.78]
        },
        orient = 'index'
     )

Here it is:

       0     1     2
qux  0.3  4.10  2.78
foo  0.0  0.31  0.45
bar  1.0  0.55  3.01

I can change all values less than 1 in the DataFrame to some other value (0) in this way:

df[df < 1] = 0

This results in this:

       0    1     2
qux  0.0  4.1  2.78
foo  0.0  0.0  0.00
bar  1.0  0.0  3.01

How could I apply such a change to all columns except, say, column 2? This would result in the following:

       0    1     2
qux  0.0  4.1  2.78
foo  0.0  0.0  0.45
bar  1.0  0.0  3.01

Solution

  • The boolean mask used for indexing does not have to cover every column of the DataFrame. Columns missing from the mask are left untouched by the assignment, so you can drop column 2 when constructing the boolean condition:

    df[df.drop(2, axis=1) < 1] = 0
    
    df
    #         0   1    2
    #foo    0.0 0.0 0.45
    #qux    0.0 4.1 2.78
    #bar    1.0 0.0 3.01
    
    # Likewise, starting again from the original df, exclude column 1 instead:
    df[df.drop(1, axis=1) < 1] = 0
    
    df
    #         0    1       2
    #foo    0.0 0.31    0.00
    #qux    0.0 4.10    2.78
    #bar    1.0 0.55    3.01
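
If you would rather name the columns explicitly, for either the "included" or the "excluded" case from the question, one possible alternative is to select the columns first and use DataFrame.mask on that subset. This is only a sketch, not the approach from the answer above: the variable names (cols_excl, cols_incl, excl, incl) are made up for illustration, and it works on copies so the original df stays unchanged.

```python
import pandas as pd

df = pd.DataFrame.from_dict(
    {
        'foo': [0.00, 0.31, 0.45],
        'bar': [1.00, 0.55, 3.01],
        'qux': [0.30, 4.10, 2.78],
    },
    orient='index',
)

# Excluded case: every column except 2.
cols_excl = df.columns.difference([2])
excl = df.copy()
# mask() replaces values where the condition is True and keeps the rest.
excl[cols_excl] = excl[cols_excl].mask(excl[cols_excl] < 1, 0)

# Included case: only the listed columns (here just column 0).
cols_incl = [0]
incl = df.copy()
incl[cols_incl] = incl[cols_incl].mask(incl[cols_incl] < 1, 0)

print(excl)
print(incl)
```

Here excl should match the desired result from the question (column 2 untouched), while incl only changes column 0.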