I have a list of intervals (time windows) in which machines are operating, e.g. 0-14, 22-38, 46-62, etc. in hours.
My operations have various durations, e.g. 1 hour, 5 hours, 7 hours, 32 hours, etc.
For each available start hour, I want to work out how long an operation will take in total, and return all the unique values. An operation is allowed to run past the end of an interval, but if it does, it must continue in the next available interval, and the idle time in between must be added to its duration:
Example:
Time windows: [[1-4], [9-13], [15-21], ...]
Duration: 4 hours
Start hour:   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
Total hours:  4  8  8  8  x  x  x  x  4  4  5  5  5  x  4  4  4  4  x  x  x
(x indicates that the process cannot start in this time period (machine idle time))
Return: [4, 5, 8]
EDIT: I've tried to extract the hours that are eligible with a duration of 4 and identified those that exceed the interval, but I'm not sure how to add the idle time. Another issue I can see is what to do if the operation runs past not only the first interval but also the next one, e.g. with a duration of 32 hours.
duration = 4
unique_values = [duration]
for i in range(max(interval_list)[1]):  # Get max hour
    for j in interval_list:  # For each interval
        if i >= j[0] and i <= j[1]:  # If hour is in interval
            if i + duration <= j[1]:  # If hour + duration <= interval upper bound
                unique_values.append(duration)  # Add duration
            else:  # Else if duration exceeds interval... :-(
                pass
        else:
            pass
The following seems to work:
First, define your time windows, along with the start and end hours for your DataFrame:
import numpy as np
import pandas as pd

timewindows = [[1, 4], [9, 13], [15, 21]]
start, end = timewindows[0][0], timewindows[-1][-1]
Next, let's construct a DataFrame containing the hours spanned by your time windows and whether each one is "available" or not:
run_df = pd.DataFrame({"hour": range(start, end + 1)})
# An hour is available if it falls inside any inclusive [first, last] window.
run_df['available'] = run_df['hour'].apply(lambda h: any(lo <= h <= hi for lo, hi in timewindows))
run_df = run_df.set_index("hour")
run_df.head(10)
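As a quick sanity check on the availability column, here is a minimal pure-Python sketch of the same test. It assumes the windows are inclusive [first, last] hour slots, as in the question's example; the names available and idle_hours are only illustrative and are not used elsewhere in this answer.

```python
timewindows = [[1, 4], [9, 13], [15, 21]]
start, end = timewindows[0][0], timewindows[-1][-1]

# An hour is available when it falls inside any inclusive [first, last] window.
available = {
    hour: any(first <= hour <= last for first, last in timewindows)
    for hour in range(start, end + 1)
}

# Hours between the windows, where the machine is idle.
idle_hours = [hour for hour, ok in available.items() if not ok]
print(idle_hours)  # [5, 6, 7, 8, 14]
```

These idle hours are the rows that should show False in the DataFrame.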
Next, define the function:
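def duration_func(df, duration):
    # Only hours in which the machine is available can be start hours.
    true_df = df[df['available']]
    for i in true_df.index:
        try:
            # First available hour at or after the nominal finish hour.
            earliest_available = true_df.loc[(i - 1 + duration):].index[0]
            # Idle hours between the start hour and that hour.
            idle_time = (~df.loc[i:earliest_available, 'available']).sum()
            finish = duration + idle_time
        except IndexError:
            # No available hour is left, so the operation cannot finish.
            finish = np.nan
        df.loc[i, 'finish'] = finish
    return df

run_df = duration_func(run_df, duration=4)
run_df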
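One caveat: duration_func only looks as far as the first available hour at or after i - 1 + duration, so it may undercount when an operation has to cross more than one idle gap (the 32-hour case raised in the question). Below is a minimal pure-Python sketch of an alternative that handles any number of gaps. It is not the method above. The function name elapsed_per_start is hypothetical, and it assumes the windows are sorted, non-overlapping, inclusive [first, last] hour slots, with one hour of work filling one slot.

```python
def elapsed_per_start(windows, duration):
    """Map each available hour to the total elapsed hours (work + idle) of an
    operation started at that hour, or None if it cannot finish.

    Assumes sorted, non-overlapping, inclusive [first, last] windows.
    """
    # Flatten the windows into the ordered list of hours the machine can work.
    slots = [h for first, last in windows for h in range(first, last + 1)]
    result = {}
    for pos, hour in enumerate(slots):
        last_pos = pos + duration - 1  # index of the final working slot
        if last_pos < len(slots):
            # Elapsed time runs from the start hour through the final slot,
            # so any idle hours in between are counted automatically.
            result[hour] = slots[last_pos] - hour + 1
        else:
            result[hour] = None  # would run past the last window
    return result


windows = [[1, 4], [9, 13], [15, 21]]
per_hour = elapsed_per_start(windows, duration=4)
unique_values = sorted({v for v in per_hour.values() if v is not None})
print(unique_values)  # [4, 5, 8]
```

Because the work is counted in available slots rather than clock hours, a longer operation simply lands on a later slot: with duration=8, starting at hour 3 gives 13 (8 working hours plus idle hours 5-8 and 14).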
Hope this helps!