I am new to pandas and Python. I searched but couldn't find an answer to exactly my problem. I am trying to find the best way to fill a new column in a pandas DataFrame, 'Sample Location', based on the contents of another column, 'NO', binning the values into defined collections.
The first problem is:
if TestLocation == 'LH Duct':
    df['Sample Location'] = df.apply(
        lambda x: samplePoint(x['NO']),
        axis=1
    )
I am not sure this is formed correctly, as my dataframe is getting kind of jumbled up.
Second question: is there a more Pythonic way of doing this check?
def samplePoint(n):
    if n <= 15:
        v = 'P1 S1'
    elif n >= 20 & n <= 35:
        v = 'P1 S2'
    elif n >= 40 & n <= 55:
        v = 'P1 S3'
    elif n >= 60 & n <= 75:
        v = 'P1 S4'
    elif n >= 80 & n <= 95:
        v = 'P1 S5'
    elif n >= 100 & n <= 115:
        v = 'P1 S6'
    elif n >= 150 & n <= 165:
        v = 'P2 S1'
    elif n >= 170 & n <= 185:
        v = 'P2 S2'
    elif n >= 190 & n <= 205:
        v = 'P2 S3'
    elif n >= 210 & n <= 225:
        v = 'P2 S4'
    elif n >= 230 & n <= 245:
        v = 'P2 S5'
    elif n >= 250 & n <= 265:
        v = 'P2 S6'
    else:
        v = 'null'
    return v
I thought the whole thing could/should be done as an apply/lambda but I got a little lost. If someone could explain this or send me a good link I would be eternally grateful!
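One thing to note about the check in the question before restructuring it: in Python, & is the bitwise AND operator and binds more tightly than the comparison operators, so n >= 20 & n <= 35 is evaluated as n >= (20 & n) <= 35 rather than as two comparisons joined together. For non-negative integer inputs that expression is always true, so every value above 15 would be labelled 'P1 S2', which may well be why the column looks jumbled. The small sketch below illustrates the difference; the helper names in_range_buggy and in_range_fixed are made up for this illustration.

```python
def in_range_buggy(n):
    # Parsed as: n >= (20 & n) <= 35, because & binds tighter than >= and <=.
    return n >= 20 & n <= 35


def in_range_fixed(n):
    # A chained comparison tests both bounds as intended.
    return 20 <= n <= 35


for n in (10, 25, 40):
    # The buggy version reports True for all three values;
    # the fixed version reports True only for 25.
    print(n, in_range_buggy(n), in_range_fixed(n))
```

Writing the test as 20 <= n <= 35, or as n >= 20 and n <= 35, avoids the problem.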
Possibly the value of v_code can be calculated. Otherwise, I would put the options in a list of dicts and write the function samplePoint as follows:
# Each entry maps an inclusive (low, high) range of 'NO' values to its label.
samples = [
    {'range': (0, 15),
     'v_code': 'P1 S1'},
    {'range': (20, 35),
     'v_code': 'P1 S2'},
    {'range': (40, 55),
     'v_code': 'P1 S3'},
    {'range': (60, 75),
     'v_code': 'P1 S4'},
    {'range': (80, 95),
     'v_code': 'P1 S5'},
    {'range': (100, 115),
     'v_code': 'P1 S6'},
    {'range': (150, 165),
     'v_code': 'P2 S1'},
    {'range': (170, 185),
     'v_code': 'P2 S2'},
    {'range': (190, 205),
     'v_code': 'P2 S3'},
    {'range': (210, 225),
     'v_code': 'P2 S4'},
    {'range': (230, 245),
     'v_code': 'P2 S5'},
    {'range': (250, 265),
     'v_code': 'P2 S6'},
]


def samplepoint(n):
    # Return the label of the first range that contains n.
    for sample in samples:
        if sample['range'][0] <= n <= sample['range'][1]:
            return sample['v_code']
    # No range matched (for example, n falls in a gap between ranges).
    return 'null'


if __name__ == '__main__':
    print(samplepoint(10))
I also renamed samplePoint to samplepoint, as per the Python naming convention of lowercase function names. To make the module less cluttered, you can import the list samples from a config file where you keep all your constants and settings. Thus:
from my_config import samples


def samplepoint(n):
    for sample in samples:
        if sample['range'][0] <= n <= sample['range'][1]:
            return sample['v_code']
    return 'null'


if __name__ == '__main__':
    print(samplepoint(100))
where the file my_config.py is:
# Each entry maps an inclusive (low, high) range of 'NO' values to its label.
samples = [
    {'range': (0, 15),
     'v_code': 'P1 S1'},
    {'range': (20, 35),
     'v_code': 'P1 S2'},
    {'range': (40, 55),
     'v_code': 'P1 S3'},
    {'range': (60, 75),
     'v_code': 'P1 S4'},
    {'range': (80, 95),
     'v_code': 'P1 S5'},
    {'range': (100, 115),
     'v_code': 'P1 S6'},
    {'range': (150, 165),
     'v_code': 'P2 S1'},
    {'range': (170, 185),
     'v_code': 'P2 S2'},
    {'range': (190, 205),
     'v_code': 'P2 S3'},
    {'range': (210, 225),
     'v_code': 'P2 S4'},
    {'range': (230, 245),
     'v_code': 'P2 S5'},
    {'range': (250, 265),
     'v_code': 'P2 S6'},
]
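To come back to the first part of the question, filling the new column: the following is a minimal sketch, not a definitive implementation. It assumes the DataFrame has a numeric column 'NO'; the example data is made up, and the lookup table is written as (low, high, label) tuples under the hypothetical name RANGES purely to keep the example short and self-contained. The first range starts at 0, as in the list above, whereas the original check accepted anything up to 15.

```python
import pandas as pd

# (low, high, label) triples, copied from the ranges in the question.
RANGES = [
    (0, 15, 'P1 S1'), (20, 35, 'P1 S2'), (40, 55, 'P1 S3'),
    (60, 75, 'P1 S4'), (80, 95, 'P1 S5'), (100, 115, 'P1 S6'),
    (150, 165, 'P2 S1'), (170, 185, 'P2 S2'), (190, 205, 'P2 S3'),
    (210, 225, 'P2 S4'), (230, 245, 'P2 S5'), (250, 265, 'P2 S6'),
]


def samplepoint(n):
    """Return the label whose inclusive range contains n, else 'null'."""
    for low, high, label in RANGES:
        if low <= n <= high:
            return label
    return 'null'


# Made-up example data; only the column name 'NO' comes from the question.
df = pd.DataFrame({'NO': [10, 25, 45, 118, 150, 260]})

# Series.apply calls samplepoint once for each value in the 'NO' column.
df['Sample Location'] = df['NO'].apply(samplepoint)
print(df)
```

Applying the function to the single column with df['NO'].apply(samplepoint) gives the same result as the row-wise df.apply(lambda x: ..., axis=1) form, without needing the lambda.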