Following are the standard values for score and grades. Score column contains numbers within specified range.
Score | Grade1 | Score2 | Grade3 | Grade4 |
---|---|---|---|---|
<260 | A1 | <200 | D1 | ID1 |
260-275 | A2 | 200-250 | D2 | ID2 |
275-280 | A3 | 250-260 | D3 | ID3 |
280-285 | A4 | 260-270 | D4 | ID4 |
>=285 | A5 | >=270 | D5 | ID5 |
Input :-
Score | Grade1 | Score2 | Grade3 | Grade4 |
---|---|---|---|---|
290 | A1 | 265 | D3 | ID2 |
Output:-
Tiers | Grade | Grade1 | Score2 | Grade3 | Grade4 |
---|---|---|---|---|---|
1 | A1 | ||||
2 | ID2 | ||||
3 | D3 | ||||
4 | 265 | ||||
5 | 290 |
Input :-
Score | Grade1 | Score2 | Grade3 | Grade4 |
---|---|---|---|---|
290 | A1 | 265 | D3 | ID2 |
Assuming df
and df_input
your two DataFrames, you can use a combination of pandas and numpy:
# get df_input as Series
s = df_input.loc[0]
# find identical values
mask = df.eq(s)
# match ranges
def match_range(val, ref):
# extract the upper value, for the last replace by np.inf
s = pd.to_numeric(ref.str.extract('[<-](\d+)', expand=False)).fillna(np.inf)
# define default output as False
out = np.zeros(len(ref), dtype=bool)
# find matching position
out[np.searchsorted(s, val)] = True
return out
# apply match_range on columns for which a direct match failed
m = ~mask.any()
mask.loc[:, m] = df.loc[:, m].apply(lambda x: match_range(s[x.name], x))
# generate output
out = (pd.DataFrame(np.where(mask, s, ''),
index=np.arange(len(df))+1,
columns=df.columns)
.rename_axis('Tiers').reset_index()
)
Output:
Tiers Score Grade1 Score2 Grade3 Grade4
0 1 A1
1 2 ID2
2 3 D3
3 4 265
4 5 290
Used inputs:
df = pd.DataFrame({'Score': ['<260', '260-275', '275-280', '280-285', '>=285'],
'Grade1': ['A1', 'A2', 'A3', 'A4', 'A5'],
'Score2': ['<200', '200-250', '250-260', '260-270', '>=270'],
'Grade3': ['D1', 'D2', 'D3', 'D4', 'D5'],
'Grade4': ['ID1', 'ID2', 'ID3', 'ID4', 'ID5']})
df_input = pd.DataFrame({'Score': [290], 'Grade1': ['A1'], 'Score2': [265], 'Grade3': ['D3'], 'Grade4': ['ID2']})