I'm working with a data set that looks something like this. I'm trying to use pivot_longer and dplyr to get the means and sd under their own respective columns so I can do some plotting but I can't seem to get it to work properly. Could someone give me some tips?
day | mean_depth | depth_sd | mean_temp | temp_sd |
---|---|---|---|---|
2021-01-01 | 20.5 | 5.2 | 15.8 | 3.8 |
2021-01-02 | 15.5 | 3.9 | 15.6 | 2.2 |
I hope to end up with a table that looks like this:
day | variable | mean | sd |
---|---|---|---|
2021-01-01 | depth | 20.5 | 5.2 |
2021-01-01 | temp | 15.8 | 3.8 |
2021-01-02 | depth | 15.5 | 3.9 |
2021-01-02 | temp | 15.6 | 2.2 |
If you clean up the column names so the variable is awlays first or always last, this is easier. For example
library(dplyr)
library(tidyr)
dd %>%
rename_with(.cols=starts_with("mean"), function(x) gsub("([^_]*)_(.*)", "\\2_\\1", x)) %>%
pivot_longer(-day, names_pattern="(.*)_(.*)", names_to=c("variable", ".value"))
This returns
day variable mean sd
<chr> <chr> <dbl> <dbl>
1 2021-01-01 depth 20.5 5.2
2 2021-01-01 temp 15.8 3.8
3 2021-01-02 depth 15.5 3.9
4 2021-01-02 temp 15.6 2.2
Tested with
dd <- read.table(text="day mean_depth depth_sd mean_temp temp_sd
2021-01-01 20.5 5.2 15.8 3.8
2021-01-02 15.5 3.9 15.6 2.2", header=T)