python-3.x pandas scikit-learn sklearn-pandas

If-else and for loop in one line


I need to apply an if-else condition and a for loop in a single line. I want to keep both 'RL' and 'RM' as they are and set every other value to 'Others'. Is this possible, and if so, how? My current attempt only keeps 'RL':

train['MSZoning']=['RL' if x=='RL' else 'Others' for x in train['MSZoning']]

Solution

  • Use numpy.where (this form keeps a single value, here RM, and replaces everything else):

    train['MSZoning'] = np.where(train['MSZoning'] == 'RM', 'RM', 'Others')
    

    If you need to replace every value except RM and RL, use isin and invert the boolean mask with ~:

    import pandas as pd

    train = pd.DataFrame({'MSZoning': ['RL'] * 3 + ['qa', 'RM', 'as']})
    # Overwrite every row whose value is neither RM nor RL
    train.loc[~train['MSZoning'].isin(['RM', 'RL']), 'MSZoning'] = 'Others'

    print(train)
      MSZoning
    0       RL
    1       RL
    2       RL
    3   Others
    4       RM
    5   Others
    

    Timings:

    train = pd.DataFrame({'MSZoning': ['RL'] * 3 + ['qa', 'RM', 'as']})
    train = pd.concat([train] * 10000, ignore_index=True)
    # [60000 rows x 1 columns]
    
    In [202]: %timeit train.loc[~train['MSZoning'].isin(['RM','RL']), 'MSZoning'] =  'Others'
    5.82 ms ± 447 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    
    In [203]: %timeit train['MSZoning'] = train['MSZoning'].apply(lambda x: x if x in ('RM', 'RL') else 'Others')
    15 ms ± 584 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
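
For reference, the approaches above can be put side by side as single-line expressions. This is only a sketch on the same six-row sample frame. The names `keep`, `mask`, `via_listcomp`, `via_np_where` and `via_series_where` are illustrative and not from the original answer, and `Series.where` is an extra alternative that the answer does not mention.

```python
import numpy as np
import pandas as pd

# Same small sample frame as in the answer
train = pd.DataFrame({'MSZoning': ['RL'] * 3 + ['qa', 'RM', 'as']})

keep = ['RM', 'RL']                  # values to leave untouched (illustrative name)
mask = train['MSZoning'].isin(keep)  # True where the value should be kept

# The asker's list comprehension, extended to keep both RL and RM
via_listcomp = [x if x in keep else 'Others' for x in train['MSZoning']]

# np.where: take the original value where mask is True, 'Others' elsewhere
via_np_where = np.where(mask, train['MSZoning'], 'Others')

# Series.where: keep values where mask is True, replace the rest
via_series_where = train['MSZoning'].where(mask, 'Others')

train['MSZoning'] = via_series_where
print(train['MSZoning'].tolist())
```

All three expressions should produce the same column. The vectorised forms avoid a Python-level loop over the rows, which is consistent with the timings above, although this sketch does not measure that itself.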