I am trying to label the sequential stages that a machine goes through while it is shutting down. The shutdown cycle consists of several stages, and for some of them the machine can move back to an earlier stage in the sequence. The data alone cannot distinguish all possible stages, because some of them show the same readings; only the position in the timeline tells where the machine is in the cycle.
I created a sample dataset to give an example of the data:
import pandas as pd
data = {
"Date and Time": ["2020-06-07 00:00", "2020-06-07 00:01", "2020-06-07 00:02", "2020-06-07 00:03", "2020-06-07 00:04", "2020-06-07 00:05", "2020-06-07 00:06", "2020-06-07 00:07", "2020-06-07 00:08", "2020-06-07 00:09", "2020-06-07 00:10", "2020-06-07 00:11", "2020-06-07 00:12", "2020-06-07 00:13", "2020-06-07 00:14", "2020-06-07 00:15", "2020-06-07 00:16", "2020-06-07 00:17", "2020-06-07 00:18", "2020-06-07 00:19", "2020-06-07 00:20", "2020-06-07 00:21", "2020-06-07 00:22", "2020-06-07 00:23", "2020-06-07 00:24", "2020-06-07 00:25", "2020-06-07 00:26", "2020-06-07 00:27", "2020-06-07 00:28", "2020-06-07 00:29"],
"Current": [16.2, 15.1, 13.8, 12.0, 11.9, 12.1, 10.8, 9.8, 8.3, 6.2, 4.3, 4.2, 4.2, 3.3, 1.8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
"Flow": [39.8, 40.3, 40.2, 40.1, 40.3, 39.8, 40.1, 40.2, 40.4, 39.6, 40, 39.3, 40.7, 38.9, 39.3, 0, 0, 39.3, 39.2, 0, 0, 38.9, 38.7, 0, 39.3, 39.2, 40.3, 0, 0, 0]
}
df = pd.DataFrame(data)
I already tried to distinguish between the phases with the following code:
# Calculate the difference between two datapoints regarding the current change
df['Current_ddt'] = ((df["Current"]) - (df["Current"].shift(1)))
# Determine which part of the shutdown the machine is in based on current and flow data
df.loc[(df["Current"] > 4.5) & (df["Current_ddt"] > -1), 'progress in shutdown cycle'] = 'Running'
df.loc[(df["Current"] > 4.5) & (df["Current_ddt"] <= -1), 'progress in shutdown cycle'] = 'Ramping down'
df.loc[(df["Current"] > 4) & (df["Current"] < 4.5) & (df["Current_ddt"] > -1), 'progress in shutdown cycle'] = 'Ramp down complete between 4-4.5'
df.loc[(df["Current"] < 4.5) & (df["Current"] != 0) & (df["Current_ddt"] < -1), 'progress in shutdown cycle'] = 'Shutdown' # Not possible to go back to an earlier stage
df.loc[(df["Current"] == 0) & (df["Flow"] == 0), 'progress in shutdown cycle'] = 'de-energized' # Not possible to go back to an earlier stage
df.loc[(df["Current"] == 0) & (df["Flow"] != 0), 'progress in shutdown cycle'] = 'flushing' #Ideally this could distinguish first, second and third flush
This part works OK up to de-energized. Ultimately I would like to distinguish between a normal ramp-down (i.e. going to a lower production level) and a ramp-down to 4.5, because I am only interested in the real shutdown of the machine: that is when most damage can be done if the shutdown is performed incorrectly.
However, the part after de-energized is giving me the most problems. There are three flushing cycles. The first is a general purge to empty the machine; the second and (optional) third flush make sure the machine is clean and ready for maintenance. The data shows no difference between them, so I am looking for a sequential way to tell them apart, but I do not know how to do it.
The ideal output would be something like this:
Date and Time | Current | Flow | Current_ddt | Progress in shutdown cycle |
---|---|---|---|---|
2020-06-07 00:00 | 16.2 | 39.8 | ||
2020-06-07 00:01 | 15.1 | 40.3 | -1.1 | Ramping down |
2020-06-07 00:02 | 13.8 | 40.2 | -1.3 | Ramping down |
2020-06-07 00:03 | 12 | 40.1 | -1.8 | Ramping down |
2020-06-07 00:04 | 11.9 | 40.3 | -0.0999999999999996 | Running |
2020-06-07 00:05 | 12.1 | 39.8 | 0.199999999999999 | Running |
2020-06-07 00:06 | 10.8 | 40.1 | -1.3 | Ramping down |
2020-06-07 00:07 | 9.8 | 40.2 | -1 | Ramping down |
2020-06-07 00:08 | 8.3 | 40.4 | -1.5 | Ramping down |
2020-06-07 00:09 | 6.2 | 39.6 | -2.1 | Ramping down |
2020-06-07 00:10 | 4.3 | 40 | -1.9 | Shutdown |
2020-06-07 00:11 | 4.2 | 39.3 | -0.0999999999999996 | Ramp down complete between 4-4.5 |
2020-06-07 00:12 | 4.2 | 40.7 | 0 | Ramp down complete between 4-4.5 |
2020-06-07 00:13 | 3.3 | 38.9 | -0.9 | Shutdown |
2020-06-07 00:14 | 1.8 | 39.3 | -1.5 | Shutdown |
2020-06-07 00:15 | 0 | 0 | -1.8 | de-energized |
2020-06-07 00:16 | 0 | 0 | 0 | de-energized |
2020-06-07 00:17 | 0 | 39.3 | 0 | purging |
2020-06-07 00:18 | 0 | 39.2 | 0 | purging |
2020-06-07 00:19 | 0 | 0 | 0 | purged |
2020-06-07 00:20 | 0 | 0 | 0 | purged |
2020-06-07 00:21 | 0 | 38.9 | 0 | second flush |
2020-06-07 00:22 | 0 | 38.7 | 0 | second flush |
2020-06-07 00:23 | 0 | 0 | 0 | flushed |
2020-06-07 00:24 | 0 | 39.3 | 0 | third flush |
2020-06-07 00:25 | 0 | 39.2 | 0 | third flush |
2020-06-07 00:26 | 0 | 40.3 | 0 | third flush |
2020-06-07 00:27 | 0 | 0 | 0 | flushed and stopped |
2020-06-07 00:28 | 0 | 0 | 0 | flushed and stopped |
2020-06-07 00:29 | 0 | 0 | 0 | flushed and stopped |
Any tips?
I've implemented a simple state machine (as a generator) based on the "Current" and "Flow" columns:
def state_machine():
    current_state = None
    # The first row only provides the starting value of "current"
    current, flow = yield

    while True:
        c, flow = yield current_state
        current_ddt = c - current
        current = c

        if current > 4.5:
            if current_ddt <= -1:
                current_state = "Ramping down"
            else:
                current_state = "Running"
        elif current > 4:
            if current_ddt < -1:
                current_state = "Shutdown"
            else:
                current_state = "Ramp down complete between 4-4.5"
        elif current > 0:
            current_state = "Shutdown"
        else:
            states = iter(
                [
                    "Purging",
                    "Purged",
                    "Second Flush",
                    "Flushed",
                    "Third Flush",
                    "Flushed and stopped",
                ]
            )

            # current is == 0, check the flow:
            if flow == 0:
                current_state = "De-energized"
                waiting_for_zero = False
            else:
                current_state = next(states)  # Purging
                waiting_for_zero = True

            while True:
                current, flow = yield current_state
                if flow > 0 and waiting_for_zero is False:
                    current_state = next(states)
                    waiting_for_zero = True
                elif flow == 0 and waiting_for_zero is True:
                    current_state = next(states)
                    waiting_for_zero = False

                if current_state == "Flushed and stopped":
                    # We are stopped completely, don't react to changes of "current" and/or "flow"
                    while True:
                        yield current_state
s = state_machine()
next(s)
df["Progress in shutdown cycle"] = df.apply(
    lambda x: s.send((x["Current"], x["Flow"])), axis=1
)
print(df)
Prints:
Date and Time Current Flow Progress in shutdown cycle
0 2020-06-07 00:00 16.2 39.8 None
1 2020-06-07 00:01 15.1 40.3 Ramping down
2 2020-06-07 00:02 13.8 40.2 Ramping down
3 2020-06-07 00:03 12.0 40.1 Ramping down
4 2020-06-07 00:04 11.9 40.3 Running
5 2020-06-07 00:05 12.1 39.8 Running
6 2020-06-07 00:06 10.8 40.1 Ramping down
7 2020-06-07 00:07 9.8 40.2 Ramping down
8 2020-06-07 00:08 8.3 40.4 Ramping down
9 2020-06-07 00:09 6.2 39.6 Ramping down
10 2020-06-07 00:10 4.3 40.0 Shutdown
11 2020-06-07 00:11 4.2 39.3 Ramp down complete between 4-4.5
12 2020-06-07 00:12 4.2 40.7 Ramp down complete between 4-4.5
13 2020-06-07 00:13 3.3 38.9 Shutdown
14 2020-06-07 00:14 1.8 39.3 Shutdown
15 2020-06-07 00:15 0.0 0.0 De-energized
16 2020-06-07 00:16 0.0 0.0 De-energized
17 2020-06-07 00:17 0.0 39.3 Purging
18 2020-06-07 00:18 0.0 39.2 Purging
19 2020-06-07 00:19 0.0 0.0 Purged
20 2020-06-07 00:20 0.0 0.0 Purged
21 2020-06-07 00:21 0.0 38.9 Second Flush
22 2020-06-07 00:22 0.0 38.7 Second Flush
23 2020-06-07 00:23 0.0 0.0 Flushed
24 2020-06-07 00:24 0.0 39.3 Third Flush
25 2020-06-07 00:25 0.0 39.2 Third Flush
26 2020-06-07 00:26 0.0 40.3 Third Flush
27 2020-06-07 00:27 0.0 0.0 Flushed and stopped
28 2020-06-07 00:28 0.0 0.0 Flushed and stopped
29 2020-06-07 00:29 0.0 0.0 Flushed and stopped
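For the part after de-energized only, the same idea (count how often the flow switches on and off) can also be sketched without a generator. This is a rough vectorized alternative, not a replacement for the state machine above. It assumes that the current never rises again once it has reached zero (as stated in the question) and that a flush starts whenever the flow goes from 0 to a positive value. The function name `label_flush_phases` and the labels "Flush n" / "Flush n done" are made up for illustration; they could be mapped to "Purging", "Second Flush" and so on.

```python
import pandas as pd


def label_flush_phases(current: pd.Series, flow: pd.Series) -> pd.Series:
    """Label rows with zero current by counting flow on/off transitions.

    Hypothetical helper: rows where the current is not zero stay unlabelled (None).
    """
    off = current.eq(0)            # machine draws no current
    flowing = off & flow.gt(0)     # flow is on while the current is zero

    # Rising edge: first row of each flush (previous row was not flowing).
    rising = flowing & ~flowing.shift(1, fill_value=False)
    flush_no = rising.cumsum()     # 0 before the first flush, then 1, 2, 3, ...

    labels = pd.Series([None] * len(current), index=current.index, dtype="object")
    labels[off & flush_no.eq(0)] = "De-energized"
    for n in range(1, int(flush_no.max()) + 1):
        in_cycle = off & flush_no.eq(n)
        labels[in_cycle & flowing] = f"Flush {n}"
        labels[in_cycle & ~flowing] = f"Flush {n} done"
    return labels


# Small made-up sample: one running row, then two flushes.
demo_current = pd.Series([1.8, 0, 0, 0, 0, 0, 0])
demo_flow = pd.Series([39.3, 0, 39.3, 0, 38.9, 0, 0])
print(label_flush_phases(demo_current, demo_flow).tolist())
```

With the question's dataframe this would be called as `label_flush_phases(df["Current"], df["Flow"])`. Because the flush number comes from a running count, an optional third flush simply shows up as "Flush 3" when it occurs and is absent otherwise.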