I have data on the outside temperature
df1 <- read.table(text = "DT temp.out
'2023-01-18 00:00:00' 6.8
'2023-02-18 23:00:00' 1.5
'2023-03-04 00:00:00' 2.6
'2023-04-20 03:00:00' -5.0
'2023-06-21 05:00:00' 11.9
'2023-08-10 19:00:00' 6.2
'2023-08-21 23:00:00' -2.8
'2023-09-19 01:00:00' 5.7
'2023-11-3 07:00:00' 9.1
'2023-12-21 13:00:00' -19.8", header = TRUE) %>%
mutate (DT = as.POSIXct(DT))
I want to get a summary of the number of observations that are less than or equal to the cutoff value. For the data presented, expected result:
> result
T.gr l.obs
1 -20 0
2 -19 1
3 -18 1
4 -17 1
5 -16 1
6 -15 1
7 -14 1
8 -13 1
9 -12 1
10 -11 1
11 -10 1
12 -9 1
13 -8 1
14 -7 1
15 -6 1
16 -5 2
17 -4 2
18 -3 2
19 -2 3
20 -1 3
21 0 3
22 1 3
23 2 4
24 3 5
25 5 5
26 6 6
27 7 8
28 8 8
29 9 8
30 10 9
31 11 9
32 12 10
33 13 10
34 14 10
35 15 10
36 16 10
37 17 10
38 18 10
39 19 10
40 20 10
>
The presented summary ranges from -20 to 20. Ultimately, the summary should start from -20 and end with the larger of the following two values:
Using cut
we divide the data in the required range. Use table
to calculate it's frequency, take a cumulative sum (cumsum
) and stack
it to present in a data frame format.
lvls <- seq(-21, max(20, max(df1$temp.out)+1))
cut(df1$temp.out, breaks = lvls, labels = lvls[-1]) |>
table() |>
cumsum() |>
stack() |>
rev()
Output of this is :
ind values
1 -20 0
2 -19 1
3 -18 1
4 -17 1
5 -16 1
6 -15 1
7 -14 1
8 -13 1
9 -12 1
10 -11 1
11 -10 1
12 -9 1
13 -8 1
14 -7 1
15 -6 1
16 -5 2
17 -4 2
18 -3 2
19 -2 3
20 -1 3
21 0 3
22 1 3
23 2 4
24 3 5
25 4 5
26 5 5
27 6 6
28 7 8
29 8 8
30 9 8
31 10 9
32 11 9
33 12 10
34 13 10
35 14 10
36 15 10
37 16 10
38 17 10
39 18 10
40 19 10
41 20 10