I have a data set that looks like this:
Date | hh00 | hh01 | hh02 | hh03 | hh04 | hh05 |
---|---|---|---|---|---|---|
2021-09-01 | 21 | 0 | 51 | 12 | 65 | 96 |
2021-09-01 | 0 | 13 | 12 | 34 | 65 | 34 |
2021-09-03 | 12 | 0 | 0 | 1 | 61 | 96 |
2021-09-03 | 45 | 12 | 51 | 12 | 11 | 0 |
2021-09-05 | 10 | 0 | 51 | 98 | 23 | 96 |
Where the date column is the date and the hh columns stand for the hour of observation for each day. So in the first row on 2021-09-01 there is an observation of 21 at 00:00 hours and observation of 0 at 01:00 hours etc.
I'd like transform it get a data set containing the sum of the each column by the date and hour, then organized by the datetime. So it looks like this:
Date | count |
---|---|
2021-09-01 00:00 | 21 |
2021-09-01 01:00 | 13 |
2021-09-01 02:00 | 63 |
2021-09-01 03:00 | 36 |
2021-09-01 04:00 | 130 |
2021-09-01 05:00 | 130 |
2021-09-03 00:00 | 57 |
2021-09-03 01:00 | 12 |
2021-09-03 02:00 | 51 |
2021-09-03 03:00 | 13 |
2021-09-03 04:00 | 72 |
2021-09-03 05:00 | 96 |
library(tidyverse)
library(lubridate) # loaded with tidyverse 2.0.0+
df |>
pivot_longer(-Date) |>
mutate(Date = ymd_h(paste(Date, parse_number(name)))) |>
summarize(value = sum(value), .by = Date)
Date value
<dttm> <int>
1 2021-09-01 00:00:00 21
2 2021-09-01 01:00:00 13
3 2021-09-01 02:00:00 63
4 2021-09-01 03:00:00 46
5 2021-09-01 04:00:00 130
6 2021-09-01 05:00:00 130
7 2021-09-03 00:00:00 57
8 2021-09-03 01:00:00 12
9 2021-09-03 02:00:00 51
10 2021-09-03 03:00:00 13
11 2021-09-03 04:00:00 72
12 2021-09-03 05:00:00 96
13 2021-09-05 00:00:00 10
14 2021-09-05 01:00:00 0
15 2021-09-05 02:00:00 51
16 2021-09-05 03:00:00 98
17 2021-09-05 04:00:00 23
18 2021-09-05 05:00:00 96