I'm trying to generate new rows based on the values in a certain column. As you can see, in the current data the 'days_left' column does not contain all sequential values.
import pandas as pd
current = {'assignment': [1,1,1,1,2,2,2,2,2], 'days_left': [1, 2, 5, 9, 1, 3, 4, 8, 13]}
dfcurrent = pd.DataFrame(data=current)
dfcurrent
I want to add rows to that dataframe so that 'days_left' forms a sequential list for each 'assignment'. Please see the desired output below:
desired = {'assignment': [1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2],
'days_left': [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,10,11,12,13]}
dfdesired = pd.DataFrame(data=desired)
dfdesired
Note: The original data is much bigger and has other columns as well but I just simplified it for this question.
Could you please help me solve this?
Thank you very much in advance!
You can iterate through the rows of the current dataframe and build a new dataframe. For each gap between consecutive days_left values, copy the current row once per missing value and set its days_left to that value.
Try this code:
import pandas as pd

current = {'assignment': [1,1,1,1,2,2,2,2,2], 'days_left': [1, 2, 5, 9, 1, 3, 4, 8, 13]}
dfc = pd.DataFrame(data=current)
rows = []  # rows for the new dataframe
for r in range(1, len(dfc)):  # start at 2nd row
    for i in range(dfc.iloc[r-1]['days_left'], dfc.iloc[r]['days_left']):  # fill gap of missing numbers
        row = dfc.iloc[r].copy()  # copy row
        row['days_left'] = i  # update column value
        rows.append(row)
    if r == len(dfc)-1 or dfc.iloc[r+1]['assignment'] != dfc.iloc[r]['assignment']:  # last entry in assignment
        rows.append(dfc.iloc[r])  # copy row
dfd = pd.DataFrame(rows).reset_index(drop=True)  # prevent index duplication
dfd = dfd.astype(int)  # convert all data to integers
print(dfd.to_string(index=False))
Output
assignment days_left
1 1
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
2 1
2 2
2 3
2 4
2 5
2 6
2 7
2 8
2 9
2 10
2 11
2 12
2 13
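
Since the question mentions that the real data is much bigger, a variant that avoids row-by-row copying may be worth considering. The following is only a minimal sketch and not part of the loop above. It assumes every assignment should run from days_left 1 up to its largest observed days_left, and the helper name fill_days_left is hypothetical.

```python
import pandas as pd

def fill_days_left(df: pd.DataFrame) -> pd.DataFrame:
    """Return df with one row per days_left from 1 to each assignment's maximum."""
    # Largest observed days_left per assignment.
    max_days = df.groupby('assignment')['days_left'].max()
    # Full (assignment, days_left) grid, built in a single pass.
    grid = pd.DataFrame(
        [(a, d) for a, m in max_days.items() for d in range(1, m + 1)],
        columns=['assignment', 'days_left'],
    )
    # Left-merge so existing rows keep their other columns;
    # generated rows get NaN in those columns.
    return grid.merge(df, on=['assignment', 'days_left'], how='left')

current = {'assignment': [1,1,1,1,2,2,2,2,2], 'days_left': [1, 2, 5, 9, 1, 3, 4, 8, 13]}
dfc = pd.DataFrame(data=current)
dfd = fill_days_left(dfc)
print(dfd.to_string(index=False))
```

Unlike the loop, this sketch does not copy the other columns into the generated rows. They would come out as NaN and would have to be filled separately, for example by back-filling within each assignment if the values of the following original row are wanted.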