
Pandas dataframe and apply - Can't figure out why resulting values are negative


Here's a picture of my data. The column of interest, RUL, is on the far right (the column names got cut off). I'm using the Turbofan Engine Degradation dataset from NASA, which can be found here: https://data.nasa.gov/widgets/vrks-gjie

I'm doing this in Azure ML Studio; the code snippet is below. I have two helper functions. The first, get_engine_last_cycle, computes the last cycle for a given engine (for example, engine 2 has a max cycle of 287 in this dataset, which is when it fails); when I unit test it, it seems to work as expected. The second, get_engine_remaining_life, takes the engine and cycle as arguments and returns the max cycle minus the current cycle for that engine (again, I've unit tested this and it seems to give the expected results).

For some reason this isn't working when I run my notebook. The "RUL" column should contain a sequence of decreasing, positive integers, for example 287, 286, 285, 284, etc. for engine #2. However, it's giving me negative values. I can't figure out why, but I suspect the problem is with this one piece of code:

    df['RUL'] = df[['engine', 'cycle']].apply(lambda x: get_engine_remaining_life(*x), axis=1)


    def get_engine_last_cycle(engine):
        return int(df.loc[engine, ['cycle']].max())


    def get_engine_remaining_life(engine, cycle):
        return get_engine_last_cycle(engine) - int(cycle)

    df['RUL'] = df[['engine', 'cycle']].apply(lambda x: get_engine_remaining_life(*x), axis=1)

    return df

Solution

  • Just for the sake of trying, this is how I'd implement this. Maybe it will help you.

    df['RUL'] = df.loc[:, ['engine', 'cycle']].groupby('engine').transform('max')
    df['RUL'] = df['RUL'] - df['cycle']
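
As for why the original goes negative, one likely cause (an assumption, since the question doesn't show the index of df): df.loc[engine, ['cycle']] looks up the row whose index label equals the engine number, not the rows where the engine column equals it. The sketch below uses a small made-up frame with a default 0..n-1 index to show the difference, then computes RUL with groupby/transform. The toy data and the names rul_by_label, by_label, by_engine and buggy are only for illustration.

```python
import pandas as pd

# Toy data (made up for illustration): two engines, default RangeIndex 0..6.
df = pd.DataFrame({
    "engine": [1, 1, 1, 2, 2, 2, 2],
    "cycle":  [1, 2, 3, 1, 2, 3, 4],
})

# .loc with a scalar selects by INDEX label, not by the value in 'engine'.
# Here df.loc[2, ['cycle']] is just the single row labelled 2.
by_label = int(df.loc[2, ["cycle"]].max())

# A boolean mask selects every row that belongs to engine 2.
by_engine = int(df.loc[df["engine"] == 2, "cycle"].max())


def rul_by_label(engine, cycle):
    # Same shape as the helper in the question (label-based lookup).
    return int(df.loc[engine, ["cycle"]].max()) - int(cycle)


# Reproduces the symptom: some values come out negative.
buggy = df[["engine", "cycle"]].apply(lambda x: rul_by_label(*x), axis=1)

# Vectorised RUL: per-engine max cycle aligned to every row, minus the current cycle.
df["RUL"] = df.groupby("engine")["cycle"].transform("max") - df["cycle"]

print(by_label, by_engine)
print(buggy.tolist())
print(df)
```

If you'd rather keep the helper, filtering with df.loc[df['engine'] == engine, 'cycle'].max() should give the per-engine last cycle, though the groupby version avoids rescanning the frame for every row.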