python, pandas, dataframe, nested-if

Is there a more pythonic way to nest conditional statements for filling a new column in a pandas df?


I am new to pandas and Python. I searched but couldn't find my exact problem. I am trying to find the best way to fill a new column in a pandas DataFrame, 'Sample Location', based on the contents of another column, 'NO', binning the values into defined collections.

The first problem is:

        if TestLocation == 'LH Duct': 
            df['Sample Location'] = df.apply(
                lambda x: samplePoint(x['NO']),
                axis=1
            )    

I am not sure this is formed correctly, as my DataFrame is getting kind of jumbled up.

Second question - Is there a more pythonic way of doing this check:

def samplePoint(n):
    if n <= 15:
        v = 'P1 S1'
    elif n >= 20 and n <= 35:
        v = 'P1 S2'
    elif n >= 40 and n <= 55:
        v = 'P1 S3'
    elif n >= 60 and n <= 75:
        v = 'P1 S4'
    elif n >= 80 and n <= 95:
        v = 'P1 S5'
    elif n >= 100 and n <= 115:
        v = 'P1 S6'
    elif n >= 150 and n <= 165:
        v = 'P2 S1'
    elif n >= 170 and n <= 185:
        v = 'P2 S2'
    elif n >= 190 and n <= 205:
        v = 'P2 S3'
    elif n >= 210 and n <= 225:
        v = 'P2 S4'
    elif n >= 230 and n <= 245:
        v = 'P2 S5'
    elif n >= 250 and n <= 265:
        v = 'P2 S6'
    else:
        v = 'null'
    return v

I thought the whole thing could/should be done as an apply/lambda but I got a little lost. If someone could explain this or send me a good link I would be eternally grateful!


Solution

  • The value of v_code can possibly be calculated from n; otherwise, I would put the options in a list of dicts and write the function samplePoint as follows:

    # Each entry maps an inclusive (low, high) range to its label.
    samples = [
        {'range': (0, 15),
         'v_code': 'P1 S1'},
        {'range': (20, 35),
         'v_code': 'P1 S2'},
        {'range': (40, 55),
         'v_code': 'P1 S3'},
        {'range': (60, 75),
         'v_code': 'P1 S4'},
        {'range': (80, 95),
         'v_code': 'P1 S5'},
        {'range': (100, 115),
         'v_code': 'P1 S6'},
        {'range': (150, 165),
         'v_code': 'P2 S1'},
        {'range': (170, 185),
         'v_code': 'P2 S2'},
        {'range': (190, 205),
         'v_code': 'P2 S3'},
        {'range': (210, 225),
         'v_code': 'P2 S4'},
        {'range': (230, 245),
         'v_code': 'P2 S5'},
        {'range': (250, 265),
         'v_code': 'P2 S6'},
    ]
    
    
    def samplepoint(n):
        for sample in samples:
            if sample['range'][0] <= n <= sample['range'][1]:
                return sample['v_code']
    
        return 'null'
    
    if __name__ == '__main__':
        print(samplepoint(10))
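
For the "calculated" route mentioned at the start of this answer, here is a minimal sketch. It assumes the layout that the ranges in the question suggest: slots start every 20 units and span 16 values, with the P1 group starting at 0 and the P2 group at 150. The function name samplepoint_calculated and the constants are illustrative assumptions, not part of the original code, and negative values return 'null' here (the question's first check, n <= 15, would accept them).

```python
def samplepoint_calculated(n):
    """Return the label for n, or 'null' if n falls in a gap between ranges."""
    # Assumed layout, read off the ranges in the question:
    # (prefix, first value of the group); each group has six slots.
    for prefix, base in (('P1', 0), ('P2', 150)):
        offset = n - base
        if 0 <= offset <= 115:  # 115 = last value of the sixth slot
            slot, remainder = divmod(offset, 20)
            if remainder <= 15:  # values 16-19 of each step are gaps
                return f'{prefix} S{int(slot) + 1}'
    return 'null'


if __name__ == '__main__':
    print(samplepoint_calculated(10))   # P1 S1
    print(samplepoint_calculated(17))   # null
    print(samplepoint_calculated(265))  # P2 S6
```

This is shorter than a lookup table but only holds while the ranges stay regular; if a range is ever moved or relabelled, the list of dicts is easier to maintain.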
    

    I also renamed samplePoint to samplepoint, following Python naming conventions (PEP 8 recommends lowercase function names). To keep the module less cluttered, you can import the list samples from a config file where you keep all your constants and settings:

    from my_config import samples
    
    def samplepoint(n):
        for sample in samples:
            if sample['range'][0] <= n <= sample['range'][1]:
                return sample['v_code']
    
        return 'null'
    
    if __name__ == '__main__':
        print(samplepoint(100))
    

    where the file my_config.py contains:

    # Each entry maps an inclusive (low, high) range to its label.
    samples = [
        {'range': (0, 15),
         'v_code': 'P1 S1'},
        {'range': (20, 35),
         'v_code': 'P1 S2'},
        {'range': (40, 55),
         'v_code': 'P1 S3'},
        {'range': (60, 75),
         'v_code': 'P1 S4'},
        {'range': (80, 95),
         'v_code': 'P1 S5'},
        {'range': (100, 115),
         'v_code': 'P1 S6'},
        {'range': (150, 165),
         'v_code': 'P2 S1'},
        {'range': (170, 185),
         'v_code': 'P2 S2'},
        {'range': (190, 205),
         'v_code': 'P2 S3'},
        {'range': (210, 225),
         'v_code': 'P2 S4'},
        {'range': (230, 245),
         'v_code': 'P2 S5'},
        {'range': (250, 265),
         'v_code': 'P2 S6'},
    ]
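
To tie this back to the first part of the question, here is a minimal sketch of filling the new column. Because the lookup only needs the 'NO' value, it can be applied to that one column with Series.map instead of a row-wise df.apply with axis=1. The sample data and the compact SAMPLES tuple list are assumptions made so the example runs on its own; in a real module you would import the list from my_config as above.

```python
import pandas as pd

# Compact stand-in for the lookup table: (low, high, label), bounds inclusive.
SAMPLES = [
    (0, 15, 'P1 S1'), (20, 35, 'P1 S2'), (40, 55, 'P1 S3'),
    (60, 75, 'P1 S4'), (80, 95, 'P1 S5'), (100, 115, 'P1 S6'),
    (150, 165, 'P2 S1'), (170, 185, 'P2 S2'), (190, 205, 'P2 S3'),
    (210, 225, 'P2 S4'), (230, 245, 'P2 S5'), (250, 265, 'P2 S6'),
]


def samplepoint(n):
    """Return the label whose range contains n, or 'null' if none does."""
    for low, high, label in SAMPLES:
        if low <= n <= high:
            return label
    return 'null'


# Hypothetical data, only for illustration.
df = pd.DataFrame({'NO': [10, 25, 17, 160, 300]})

# Map each 'NO' value to its label; the result keeps the same index as df.
df['Sample Location'] = df['NO'].map(samplepoint)

print(df)
```

If only some rows should be labelled (the question does not say whether TestLocation is a single variable or a column), selecting those rows with a boolean mask and df.loc before assigning is one way to leave the other rows untouched.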