Sample data I receive from the device's operation recorder:
df1 <- read.table(text = "temp.1
heating
heating
heating
heating
heating
heating
heating
heating
cooling
heating
heating
heating
heating
heating
heating
cooling
cooling
cooling
cooling
cooling
cooling
cooling
heating
heating
heating
cooling
cooling
heating
heating
heating
cooling
heating
heating
heating
heating
cooling
cooling
cooling
cooling
heating
heating
heating
cooling
heating
cooling
heating
cooling
heating
heating
heating
heating", header = TRUE)
Occasionally, one or two "cooling" observations occur during "heating". These are errors, and I would like them to be ignored. After this correction, I would like to label the duty cycles. Each label should also contain a sequential number, because I need to know how many heating and cooling cycles occurred on a given day. Expected result:
> df1
temp.1 level
1 heating H.1
2 heating H.1
3 heating H.1
4 heating H.1
5 heating H.1
6 heating H.1
7 heating H.1
8 heating H.1
9 cooling H.1
10 heating H.1
11 heating H.1
12 heating H.1
13 heating H.1
14 heating H.1
15 heating H.1
16 cooling C.1
17 cooling C.1
18 cooling C.1
19 cooling C.1
20 cooling C.1
21 cooling C.1
22 cooling C.1
23 heating H.2
24 heating H.2
25 heating H.2
26 cooling H.2
27 cooling H.2
28 heating H.2
29 heating H.2
30 heating H.2
31 cooling H.2
32 heating H.2
33 heating H.2
34 heating H.2
35 heating H.2
36 cooling C.2
37 cooling C.2
38 cooling C.2
39 cooling C.2
40 heating H.3
41 heating H.3
42 heating H.3
43 cooling H.3
44 heating H.3
45 cooling H.3
46 heating H.3
47 cooling H.3
48 heating H.3
49 heating H.3
50 heating H.3
51 heating H.3
EDIT2: There was one more case I hadn't anticipated, and my question wasn't precise. Please look at rows 51-53: when a "cooling" series is interrupted by a single "heating", it should also be ignored. I tried to modify your solution, but without success.
df1
temp.1 level
1: heating H.1
2: heating H.1
3: heating H.1
4: heating H.1
5: heating H.1
6: heating H.1
7: heating H.1
8: heating H.1
9: cooling H.1
10: heating H.1
11: heating H.1
12: heating H.1
13: heating H.1
14: heating H.1
15: heating H.1
16: cooling C.1
17: cooling C.1
18: cooling C.1
19: cooling C.1
20: cooling C.1
21: cooling C.1
22: cooling C.1
23: heating H.2
24: heating H.2
25: heating H.2
26: cooling H.2
27: cooling H.2
28: heating H.2
29: heating H.2
30: heating H.2
31: cooling H.2
32: heating H.2
33: heating H.2
34: heating H.2
35: heating H.2
36: cooling C.2
37: cooling C.2
38: cooling C.2
39: cooling C.2
40: heating H.3
41: heating H.3
42: heating H.3
43: cooling H.3
44: heating H.3
45: cooling H.3
46: heating H.3
47: cooling C.3
48: cooling C.3
49: cooling C.3
50: cooling C.3
51: cooling C.3
52: heating C.3
53: cooling C.3
54: cooling C.3
55: cooling C.3
56: heating H.4
57: heating H.4
58: heating H.4
Three consecutive "cooling" rows after "heating", or three consecutive "heating" rows after "cooling", change the category in "level". Therefore, rows 26-27 are considered errors, while rows 23-25 do change the "level".
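To make the rule concrete, here is a minimal base-R sketch that only inspects run lengths with rle(). The vector x and the column name switches_level are made up for illustration and are not recorder output; the threshold of 3 is taken from the rule stated above.

```r
# Hypothetical toy series, not real recorder data
x <- c("heating", "heating", "heating",
       "cooling", "cooling",
       "heating", "heating", "heating",
       "cooling", "cooling", "cooling")

# rle() collapses the series into consecutive runs
runs <- rle(x)

# A run counts as a real change of state only if it has 3 or more rows;
# shorter runs are treated as recording errors
overview <- data.frame(state          = runs$values,
                       length         = runs$lengths,
                       switches_level = runs$lengths >= 3)
print(overview)
```

In this toy series the two-row "cooling" run is too short to switch the level, while the final three-row "cooling" run is long enough.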
A data.table approach:
library(data.table)
# set to data.table format
setDT(df1)
# initialise heating or cooling level
df1[, level := toupper(substr(temp.1,1,1))]
# override the level of runs of length 2 or less with "H"
df1[, level := if (.N <= 2) "H", by = .(rleid(temp.1))]
# temporary value for indexing, can be dropped at the end
df1[, temp := rleid(level)]
# create the correct level id, and afterwards drop the temp column
df1[, level := paste(level, as.integer(factor(temp)), sep = "."), by = .(level)][, temp := NULL][]
Update for the updated sample data / desired output:
library(data.table)
setDT(df1)
# determine groups of 3 (or more) consecutive temp.1
df1[, group := if (.N >= 3) .GRP, by = .(rleid(temp.1))]
# fill down missing groupnumbers
setnafill(df1, type = "locf", cols = "group")
# set level letter (from initial answer)
df1[, level := toupper(substr(temp.1[1],1,1)), by = .(group)]
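# temporary run id of the level letter, dropped at the end
df1[, temp := rleid(level)]
# number the cycles within each letter (H.1, C.1, H.2, ...) and drop temp
df1[, level := paste(level, as.integer(factor(temp)), sep = "."), by = .(level)][, temp := NULL][]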
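For readers who want to check the logic without data.table, the same idea can be sketched in base R. This is not the code above, only a rough equivalent under stated assumptions: the function name label_cycles and its min_run argument are hypothetical, the series must contain at least one run of min_run rows, and short runs before the first long run are assigned to that first long run (a case the sample data does not contain).

```r
# Hypothetical helper: label duty cycles, ignoring runs shorter than min_run
label_cycles <- function(x, min_run = 3) {
  runs <- rle(as.character(x))
  long <- runs$lengths >= min_run
  stopifnot(any(long))
  # for each run, index of the most recent long run (0 if none yet)
  last_long <- cummax(seq_along(long) * long)
  # assumption: short runs before the first long run join that long run
  last_long[last_long == 0] <- which(long)[1]
  # every run inherits the state of its most recent long run
  state <- runs$values[last_long]
  # merge neighbouring runs that ended up in the same state
  merged <- rle(rep(state, runs$lengths))
  letter <- toupper(substr(merged$values, 1, 1))
  # number the cycles within each letter: H.1, C.1, H.2, ...
  counter <- ave(seq_along(letter), letter, FUN = seq_along)
  rep(paste(letter, counter, sep = "."), merged$lengths)
}

# Toy series (made up): 4 heating, 1 cooling, 3 heating, 3 cooling,
# 1 heating, 2 cooling, 3 heating
x <- rep(c("heating", "cooling", "heating", "cooling",
           "heating", "cooling", "heating"),
         times = c(4, 1, 3, 3, 1, 2, 3))
labels <- label_cycles(x)
unique(labels)  # "H.1" "C.1" "H.2"
```

Counting the distinct labels per letter then gives the number of heating and cooling cycles, for example with table(substr(unique(labels), 1, 1)).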