Tags: python, pandas, one-hot-encoding

Replace the 1 values in a subset of DataFrame columns with the value of another column


I have a one-hot encoded DataFrame such as:

| qtd |   a |   b |  c |   d |  e | ...z|
|-----|-----|-----|----|-----|----|-----|
|  90 |   1 |   0 |  0 |   0 |  0 |   0 |
|  10 |   0 |   0 |  0 |   0 |  0 |   1 |
|  40 |   0 |   1 |  0 |   0 |  0 |   0 |
|  80 |   0 |   0 |  1 |   0 |  0 |   0 |
|  90 |   0 |   1 |  0 |   0 |  0 |   0 |

I want to replace each 1 in the columns from a onward (there can be any number of them) with the value of qtd in the same row. Each row contains exactly one 1 across those columns.

The expected result is:

| qtd |   a |   b |  c |   d |  e | ...z|
|-----|-----|-----|----|-----|----|-----|
|  90 |  90 |   0 |  0 |   0 |  0 |   0 |
|  10 |   0 |   0 |  0 |   0 |  0 |  10 |
|  40 |   0 |  40 |  0 |   0 |  0 |   0 |
|  80 |   0 |   0 | 80 |   0 |  0 |   0 |
|  90 |   0 |  90 |  0 |   0 |  0 |   0 |
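
To make the example reproducible, the input table can be built as below. This is a minimal sketch: only the columns spelled out in the table (a to e, and z) are included, the elided columns between e and z are left out, and the name `indicator_cols` is introduced here purely for illustration.

```python
import pandas as pd

# Sample input from the table above. The columns elided between e and z
# are omitted, so this frame is an assumption about the real data's shape.
df = pd.DataFrame(
    {
        "qtd": [90, 10, 40, 80, 90],
        "a": [1, 0, 0, 0, 0],
        "b": [0, 0, 1, 0, 1],
        "c": [0, 0, 0, 1, 0],
        "d": [0, 0, 0, 0, 0],
        "e": [0, 0, 0, 0, 0],
        "z": [0, 1, 0, 0, 0],
    }
)

# Hypothetical helper name: every column after qtd is an indicator column.
indicator_cols = df.columns[1:]

# Sanity check of the stated constraint: exactly one 1 per row.
assert (df[indicator_cols].sum(axis=1) == 1).all()
```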

Solution

  • You can select all columns except the first with DataFrame.iloc and multiply them by the qtd column with DataFrame.mul:

    df.iloc[:, 1:] = df.iloc[:, 1:].mul(df['qtd'], axis=0)
    print(df)
       qtd   a   b   c  d  e   z
    0   90  90   0   0  0  0   0
    1   10   0   0   0  0  0  10
    2   40   0  40   0  0  0   0
    3   80   0   0  80  0  0   0
    4   90   0  90   0  0  0   0
    

    If qtd is not always the first column, you can get the other column names with Index.difference and select that subset:

    cols = df.columns.difference(['qtd'])
    df[cols] = df[cols].mul(df['qtd'], axis=0)
    

    If qtd is the index rather than a column:

    df = df.mul(df.index, axis=0)
    print(df)
          a   b   c  d  e   z
    qtd                      
    90   90   0   0  0  0   0
    10    0   0   0  0  0  10
    40    0  40   0  0  0   0
    80    0   0  80  0  0   0
    90    0  90   0  0  0   0
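
The three variants above can be checked side by side with a small self-contained sketch. It assumes the sample data from the question (without the elided columns between e and z); the names `by_position`, `by_name`, `by_index`, and `expected` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "qtd": [90, 10, 40, 80, 90],
        "a": [1, 0, 0, 0, 0],
        "b": [0, 0, 1, 0, 1],
        "c": [0, 0, 0, 1, 0],
        "d": [0, 0, 0, 0, 0],
        "e": [0, 0, 0, 0, 0],
        "z": [0, 1, 0, 0, 0],
    }
)

# Expected indicator columns, copied from the question's second table.
expected = pd.DataFrame(
    {
        "a": [90, 0, 0, 0, 0],
        "b": [0, 0, 40, 0, 90],
        "c": [0, 0, 0, 80, 0],
        "d": [0, 0, 0, 0, 0],
        "e": [0, 0, 0, 0, 0],
        "z": [0, 10, 0, 0, 0],
    }
)

# Variant 1: select by position, assuming qtd is the first column.
by_position = df.copy()
by_position.iloc[:, 1:] = by_position.iloc[:, 1:].mul(by_position["qtd"], axis=0)

# Variant 2: select by name, so the position of qtd does not matter.
by_name = df.copy()
cols = by_name.columns.difference(["qtd"])
by_name[cols] = by_name[cols].mul(by_name["qtd"], axis=0)

# Variant 3: qtd is the index, so multiply every column by the index values.
indexed = df.set_index("qtd")
by_index = indexed.mul(indexed.index, axis=0)

# All three should agree with the expected table.
for result in (by_position[expected.columns], by_name[expected.columns], by_index):
    assert (result.to_numpy() == expected.to_numpy()).all()
```

Multiplying works here because the indicator columns hold only 0 and 1, so each cell becomes either 0 or the row's qtd. Note that Index.difference returns the names sorted by default; this does not change the column order of the frame, because the assignment is done by label.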