I've got a data set with hourly air quality data for every day of the year. I'm trying to pivot wider and am running into issues when there's a null code in the NULLCODE column. These particular days pivot wider into multiple entries for the same day (for example with the small subset of my data below it splits into 3 entries for February 2nd, 2016) but I'd like only one entry for each day.
I'd like to pivot wider so that each hour has a column with values from PM25 and a column with values from NULLCODE to prevent each day from splitting into multiple rows, but am unsure how to do so.
To replicate my data:
library(tidyr)
data <- data.frame(DATE = c("2019-02-09"),
TIME = c("00:00", "01:00", "02:00","03:00", "04:00","05:00","06:00","07:00","08:00","09:00","10:00","11:00","12:00","13:00","14:00","15:00","16:00","17:00","18:00","19:00","20:00","21:00","22:00","23:00"),
PM25 = c("2","1","0","-1","-4","NA","NA","NA","-4","1","6","28","23","3","3","5","NA","9","24","24","17","16","9","15"),
NULLCODE = c("","","","","","DA","DA","DA","","","","","","","","","AM","","","","","","",""))
What my original data frame looks like
> print(data)
DATE TIME PM25 NULLCODE
1 2019-02-09 00:00 2
2 2019-02-09 01:00 1
3 2019-02-09 02:00 0
4 2019-02-09 03:00 -1
5 2019-02-09 04:00 -4
6 2019-02-09 05:00 NA DA
7 2019-02-09 06:00 NA DA
8 2019-02-09 07:00 NA DA
9 2019-02-09 08:00 -4
10 2019-02-09 09:00 1
11 2019-02-09 10:00 6
12 2019-02-09 11:00 28
13 2019-02-09 12:00 23
14 2019-02-09 13:00 3
15 2019-02-09 14:00 3
16 2019-02-09 15:00 5
17 2019-02-09 16:00 NA AM
18 2019-02-09 17:00 9
19 2019-02-09 18:00 24
20 2019-02-09 19:00 24
21 2019-02-09 20:00 17
22 2019-02-09 21:00 16
23 2019-02-09 22:00 9
24 2019-02-09 23:00 1
Here is the code I used to pivot wider that is creating three rows for 2016-02-09:
data2 <- data %>% pivot_wider(
names_from = TIME,
names_prefix = "Time_",
values_from = PM25
)
What my data frame looks like after pivoting
> print(data2)
DATE NULLCODE Time_00:00 Time_01:00 Time_02:00 Time_03:00 Time_04:00
1 2019-02-09 2 1 0 -1 -4
2 2019-02-09 DA <NA> <NA> <NA> <NA> <NA>
3 2019-02-09 AM <NA> <NA> <NA> <NA> <NA>
Time_05:00 Time_06:00 Time_07:00 Time_08:00 Time_09:00 Time_10:00 Time_11:00
1 <NA> <NA> <NA> -4 1 6 28
2 NA NA NA <NA> <NA> <NA> <NA>
3 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
Time_12:00 Time_13:00 Time_14:00 Time_15:00 Time_16:00 Time_17:00 Time_18:00
1 23 3 3 5 <NA> 9 24
2 <NA> <NA> <NA> <NA> <NA> <NA> <NA>
3 <NA> <NA> <NA> <NA> NA <NA> <NA>
Time_19:00 Time_20:00 Time_21:00 Time_22:00 Time_23:00
1 24 17 16 9 15
2 <NA> <NA> <NA> <NA> <NA>
3 <NA> <NA> <NA> <NA> <NA>
Was able to get the correct pivoting with the below:
data2 <- data %>% pivot_wider(
names_from = TIME,
names_prefix = "Time_",
values_from = c(PM25, NULLCODE)
)