I managed to get the basic decision-making logic working but, ironically, I am struggling with something very basic. My code handles about 80% of the cases, and I am asking for help with the remaining 20%. I am not even sure whether this is called branching or simply a decision tree, but it is beginner-level stuff.
A small sample of my data:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Part ID': ['Power Cord', 'Cat5 cable', 'Laptop', 'Hard Disk', 'Laptop Case', 'USB drive'],
    'Part Serial Number': [111222, 999444, 888333, 141417, np.nan, 222666],
    'Mother s/n': [100111, 200112, 888333, 888333, 888333, np.nan],
})
df['Part Serial Number'] = df['Part Serial Number'].astype('Int64')
df['Mother s/n'] = df['Mother s/n'].astype('Int64')
df
This is my code:
df['Is mother s/n known?'] = np.where(df['Mother s/n'].isin(df['Part Serial Number']), 'Yes', 'No')
df
and it gives the following output (originally posted as an image):
As the image shows, some of the results should be different. How can I branch my code with pandas to achieve this?
You can use np.select to choose between multiple conditions (not just between two, as with np.where):
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Part ID': ['Power Cord', 'Cat5 cable', 'Laptop', 'Hard Disk', 'Laptop Case', 'USB drive'],
    'Part Serial Number': [111222, 999444, 888333, 141417, np.nan, 222666],
    'Mother s/n': [100111, 200112, 888333, 888333, 888333, np.nan],
})
df['Part Serial Number'] = df['Part Serial Number'].astype('Int64')
df['Mother s/n'] = df['Mother s/n'].astype('Int64')
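conditions = [
    # part is its own mother (<NA> comparisons count as False)
    df['Mother s/n'].eq(df['Part Serial Number']).fillna(False).astype(bool),
    # mother s/n found among the part serial numbers (-1 stands in for <NA>)
    df['Mother s/n'].fillna(-1).isin(df['Part Serial Number']).astype(bool),
    df['Mother s/n'].isna(),  # mother s/n is missing
]
choices = ['Self', 'Yes', 'Mother s/n unknown']
# the first matching condition wins; rows matching none get 'No'
df['Is mother s/n known?'] = np.select(conditions, choices, 'No')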
Result:
       Part ID  Part Serial Number  Mother s/n Is mother s/n known?
0   Power Cord              111222      100111                   No
1   Cat5 cable              999444      200112                   No
2       Laptop              888333      888333                 Self
3    Hard Disk              141417      888333                  Yes
4  Laptop Case                <NA>      888333                  Yes
5    USB drive              222666        <NA>   Mother s/n unknown
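The order of the conditions matters: for each row, np.select returns the choice belonging to the first condition that is True. Here is a minimal sketch of that behaviour on made-up toy data. The names serial, mother, is_self, is_known and is_missing are illustrative only, and -1 is assumed never to occur as a real serial number.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each part's own serial number and its mother's serial number.
serial = pd.Series([10, 20, 30, 40], dtype="Int64")
mother = pd.Series([99, 20, 20, pd.NA], dtype="Int64")

# Comparing nullable integers yields <NA> where a value is missing,
# so fill with False and cast to a plain NumPy-compatible bool mask.
is_self = mother.eq(serial).fillna(False).astype(bool)
# -1 is a sentinel assumed never to be a valid serial number.
is_known = mother.fillna(-1).isin(serial).astype(bool)
is_missing = mother.isna()

# Conditions are evaluated in list order; the first True one decides the label.
labels = np.select(
    [is_self, is_known, is_missing],
    ["Self", "Yes", "Mother s/n unknown"],
    default="No",
)

# With the first two conditions swapped, the self-referencing row matches "Yes" first.
swapped = np.select(
    [is_known, is_self, is_missing],
    ["Yes", "Self", "Mother s/n unknown"],
    default="No",
)

print(labels.tolist())   # ['No', 'Self', 'Yes', 'Mother s/n unknown']
print(swapped.tolist())  # ['No', 'Yes', 'Yes', 'Mother s/n unknown']
```

This is why the 'Self' check is listed before the 'Yes' check: a part whose mother s/n equals its own serial number also satisfies the isin test, so the more specific condition has to come first.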