Search code examples
pythonpandasnumpyreplaceseries

Pandas replace by condition in the 1st row


I have a table:

Object Col1 Col2 Col3 Col4
reference 10 0 7 0
Obj1 14 9 1 30
Obj2 1 16 0 0
Obj3 9 21 3 17
Obj4 11 0 4 22

And I want to transform it by condition: if 1st row of any column is =0, then other rows of this column can be either 0 or 1. If the current value is >0, than it becomes 1, if it's 0, than it remains 0.

Resulting table is:

Object Col1 Col2 Col3 Col4
reference 10 0 7 0
Obj1 14 1 1 1
Obj2 1 1 0 0
Obj3 9 1 3 1
Obj4 11 0 4 1

I've tried this variant:

for column in df:
    df[column] = np.where((df[column][0] == 0) & (df[column] != 0), 1, 0)

but it resets to zero all columns, where 1st row isn't equal to 0. How can I do this right?


Solution

  • You can select first row by df.iloc[0] and then check which columns in this row are equal 0. Then change the particular columns:

    first_row = df.iloc[0]
    zero_columns = first_row[first_row == 0].index
    
    df.loc[:, zero_columns] = (df.loc[:, zero_columns] > 0).astype(int)
    print(df)
    

    Prints:

          Object  Col1  Col2  Col3  Col4
    0  reference    10     0     7     0
    1       Obj1    14     1     1     1
    2       Obj2     1     1     0     0
    3       Obj3     9     1     3     1
    4       Obj4    11     0     4     1