I have this dataframe:
Env location lob grid row server model make slot
Prod USA Market AB3 bc2 Server123 Hitachi dcs 1
Prod USA Market AB3 bc2 Server123 Hitachi dcs 2
Prod USA Market AB3 bc2 Server123 Hitachi dcs 3
Prod USA Market AB3 bc2 Server123 Hitachi dcs 4
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 3
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 3
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 3
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 4
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 5
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 5
Dev EMEA Ins AB6 bc4 Serverabc IBM abc 6
UAT PAC Retail AB6 bc4 Serverzzz Cisco ust 3
UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 4
UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 5
UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 6
In this example:
New rows:
This needs to be done programmatically where given the model, I need to know the total number of slots and if the number of slots is short, I need to create new rows.
The final dataframe needs to be like this:
Env location lob grid row server model make slot
Prod USA Market AB3 bc2 Server123 Hitachi dcs 1
Prod USA Market AB3 bc2 Server123 Hitachi dcs 2
Prod USA Market AB3 bc2 Server123 Hitachi dcs 3
Prod USA Market AB3 bc2 Server123 Hitachi. dcs 4
Dev EMEA Ins. AB6 bc4 Serverabc IBM abc 3
Dev EMEA Ins. AB6 bc4 Serverabc IBM abc 4
Dev EMEA Ins. AB6 bc4 Serverabc IBM abc 5
Dev EMEA Ins. AB6 bc4 Serverabc IBM abc 6
Dev EMEA Ins. available bc4 Serverabc IBM abc 7
Dev EMEA Ins. available bc4 Serverabc IBM abc 8
Dev EMEA Ins. available bc4 Serverabc IBM abc 9
Dev EMEA Ins. available bc4 Serverabc IBM abc 10
UAT PAC Retail AB6 bc4 Serverzzz Cisco ust 3
UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 4
UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 5
UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 6
UAT PAC Retail available bc4 Serverzzz Cisco ust 7
UAT PAC Retail available bc4 Serverzzz Cisco ust 8
I tried something like this:
def slots(row):
if 'IBM' in row['model']:
number_row=8
if 'Cisco' in row['model']:
number_row=6
How do I do this?
I would use groupby.apply
to add new rows to each model. The general workflow is as follows.
Remove duplicate slots from each model.
Group the dataframe by the 'model'
column.
For each model, do anything at all only if
a. it is either IBM or Cisco (identified by whether it is a key in N_slots
dictionary)
b. the number of slots does not reach the required number of slots (the values in N_slots
dictionary)
If (3) is satisfied, then use reindex()
to add new empty rows.
Assign slot values to the 'slots'
column (e.g. For IBM, it's 3-10)
Fill the newly created empty rows in the grid
column by 'available'.
Fill all other newly created rows by values one row above (ffill()
).
Reset index to remove duplicate indices.
def add_slots(s):
# get model name
model = s['model'].iat[0]
# get how many slots there should be for this model
slots = N_slots.get(model, 0)
# where to start reindexing
start = s.index[0]
low = s['slot'].dropna().astype(int).min()
# for pandas>=1.1, remove the previous line and uncomment the next line
# low = s['slot'].min()
if len(s) < slots:
# add new indices
s = s.reindex(range(start, start + slots))
# assign slots
s['slot'] = range(low, low + slots)
# assign grids at newly created slots to be 'available'
s['grid'] = s['grid'].fillna('available')
return s
N_slots = {'IBM': 8, 'Cisco': 6}
new_df = (
df.drop_duplicates(['server', 'model', 'slot'], ignore_index=True) # remove duplicate slots
.groupby('model', sort=False, group_keys=False).apply(add_slots) # add new slots
.ffill() # fill rest of the columns
.sort_values(by=['server', 'slot'])
.reset_index(drop=True) # reset index
)
Env location lob grid row server model make slot
0 Prod USA Market AB3 bc2 Server123 Hitachi dcs 1
1 Prod USA Market AB3 bc2 Server123 Hitachi dcs 2
2 Prod USA Market AB3 bc2 Server123 Hitachi dcs 3
3 Prod USA Market AB3 bc2 Server123 Hitachi dcs 4
4 Dev EMEA Ins AB6 bc4 Serverabc IBM abc 3
5 Dev EMEA Ins AB6 bc4 Serverabc IBM abc 4
6 Dev EMEA Ins AB6 bc4 Serverabc IBM abc 5
7 Dev EMEA Ins AB6 bc4 Serverabc IBM abc 6
8 Dev EMEA Ins available bc4 Serverabc IBM abc 7
9 Dev EMEA Ins available bc4 Serverabc IBM abc 8
10 Dev EMEA Ins available bc4 Serverabc IBM abc 9
11 Dev EMEA Ins available bc4 Serverabc IBM abc 10
12 UAT PAC Retail AB6 bc4 Serverzzz Cisco ust 3
13 UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 4
14 UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 5
15 UAT PAC Retail BB6 bc4 Serverzzz Cisco ust 6
16 UAT PAC Retail available bc4 Serverzzz Cisco ust 7
17 UAT PAC Retail available bc4 Serverzzz Cisco ust 8