Search code examples
pythonpandasmultiplication

Pandas inplace conditional value multiplication


I have tried all solutions I could find on the topic, all of them didn't apply to the dataframe "inplace" and the multiplication never happened.

So here is what I am trying to do: I have a multilevel column dataframe with many measurements. Every column is ordered like this:

data:

                   MeasurementType
                   Value Unit Type
    StudyNumber 
             1     1.0   m/s   a
             2     1.7   m/s   v
             3     10.5  cm/s  b

I am trying to convert all measurements with unit m/s to cm/s, i.e. I need to filter all Values with Unit m/s, multiply them by 10 and then change the Unit in the Unit column.

I managed the filter, however when I perform a multiplication on it (by *10, .mul(10) directly, or making a new assignment), it doesn't stick. Printing the dataframe afterwards shows no change in the values.

Here is the code:

    unit_df = data.iloc[:, data.columns.get_level_values(1)=='Unit']


    unit_col_list = []
    for unitcol in unit_df.columns:
        unitget = unit_df[unitcol][unit_df[unitcol].notnull()].unique()
        if unitget.size > 1:
            unit_col_list.append(unitcol)


    unit_col_list =  [item[0] for item in unit_col_list] #so I get the header of the column 
    data_wrongunits = data[unit_col_list]


data_wrongunits[unit_col_list[0]][data_wrongunits[unit_col_list[0]]['Unit'] == 'm/s']['Value']*=10

or

data_wrongunits[unit_col_list[0]][data_wrongunits[unit_col_list[0]]['Unit'] == 'm/s']['Value'].mul(10)

or

data_wrongunits[unit_col_list[0]][data_wrongunits[unit_col_list[0]]['Unit'] == 'm/s']['Value']=data_wrongunits[unit_col_list[0]][data_wrongunits[unit_col_list[0]]['Unit'] == 'm/s']['Value']*10

The filter gives me a series of the Value column. Maybe another structure would help?


Solution

  • You can use:

    print (data)
                MeasurementType            MeasurementType1           
                          Value  Unit Type            Value  Unit Type
    StudyNumber                                                       
    1                       1.0   m/s    a              1.0   m/s    a
    2                       1.7   m/s    v              1.7  cm/s    v
    3                      10.5  cm/s    b             10.5  mm/s    b
    
    #get columns with Unit
    unit_df = data.loc[:, data.columns.get_level_values(1)=='Unit']
    print (unit_df)
                MeasurementType MeasurementType1
                           Unit             Unit
    StudyNumber                                 
    1                       m/s              m/s
    2                       m/s             cm/s
    3                      cm/s             mm/s
    
    #create helper df with replace units by constants
    #if value not in dict, get NaNs, so replaced by 1
    d = {'m/s':10, 'mm/s':100}
    df1 = unit_df.applymap(d.get).fillna(1).rename(columns={'Unit':'Value'})
    print (df1)
                MeasurementType MeasurementType1
                          Value            Value
    StudyNumber                                 
    1                      10.0             10.0
    2                      10.0              1.0
    3                       1.0            100.0
    

    #filter only Value columns and multiple by df1
    data[df1.columns] = data[df1.columns].mul(df1)
    print (data)
                MeasurementType            MeasurementType1           
                          Value  Unit Type            Value  Unit Type
    StudyNumber                                                       
    1                      10.0   m/s    a             10.0   m/s    a
    2                      17.0   m/s    v              1.7  cm/s    v
    3                      10.5  cm/s    b           1050.0  mm/s    b