I was wonder if there is a way for my DATA
to be reformatted to my Desired_output
below?
Specifically, for each unique study
, we stick together a pair of pre
and postNUMBER
together, separately for t
and c
.
For example, for study == 2
, we stick together a pair of pre
and postNUMBER
together going FROM:
study time nt nc mt mc sdt sdc outcome
2 pre 38 48 1.89 2.22 0.40 0.76 fram
2 post1 38 48 4.07 2.52 2.20 1.58 fram
TO:
study time group n mpre mpost sdpre sdpost outcome
2 pre-post1 t 38 1.89 4.07 0.40 2.20 fram
2 pre-post1 c 48 2.22 2.52 0.76 1.58 fram
ps. If a unique study
in DATA
, didn't have a pre
(only consisted of postNUMBER
) or postNUMBER
(only consisted of pre
), we simply leave it out.
DATA <- read.table(header=T, text="
study time nt nc mt mc sdt sdc outcome
1 pre 28 58 0.89 1.22 1.40 1.76 Conv
1 post1 28 58 5.07 3.52 3.20 2.58 Conv
1 post2 28 58 3.64 2.86 3.15 2.80 Conv
2 pre 38 48 1.89 2.22 0.40 0.76 fram
2 post1 38 48 4.07 2.52 2.20 1.58 fram
3 post1 31 18 2.07 1.52 1.20 0.58 voca
3 post2 32 18 3.07 2.32 1.12 9.85 voca
")
Desired_output <- read.table(header=T, text="
study time group n mpre mpost sdpre sdpost outcome
1 pre-post1 t 28 0.89 5.07 1.40 3.20 Conv
1 pre-post2 t 28 0.89 3.64 1.40 3.15 Conv
1 pre-post1 c 58 1.22 3.52 1.76 2.58 Conv
1 pre-post2 c 58 1.22 2.86 1.76 2.80 Conv
2 pre-post1 t 38 1.89 4.07 0.40 2.20 fram
2 pre-post1 c 48 2.22 2.52 0.76 1.58 fram
")
A tidyverse
suggestion is to pivot_longer()
in a first time, to have one row per study-time-group (and a column for n , m and sd). In a second time, you can left_join
the "post" rows (filter(time!="pre"
) with "pre" rows (filter(time=="pre"
); with the appropriate argument.
I select
some columns. Notice that I kept the row without pre
with some NA
; they could be filtered if you want to leave them out.
library(tidyverse)
DATA <- read.table(header=T, text="
study time nt nc mt mc sdt sdc outcome
1 pre 28 58 0.89 1.22 1.40 1.76 Conv
1 post1 28 58 5.07 3.52 3.20 2.58 Conv
1 post2 28 58 3.64 2.86 3.15 2.80 Conv
2 pre 38 48 1.89 2.22 0.40 0.76 fram
2 post1 38 48 4.07 2.52 2.20 1.58 fram
3 post1 31 18 2.07 1.52 1.20 0.58 voca
3 post2 32 18 3.07 2.32 1.12 9.85 voca
")
data2 =
DATA %>% pivot_longer(nt:sdc,
names_to = c(".value","group"),
names_pattern = "(n|m|sd)(t|c)")
left_join(data2 %>% filter(time !="pre"),
data2 %>% filter(time =="pre"),
by = c("study","group"),
suffix= c("post","pre") ) %>%
select(study, time = timepost, group, npre, npost, mpre, mpost, sdpre,sdpost,
outcome = outcomepost)
#> # A tibble: 10 × 10
#> study time group npre npost mpre mpost sdpre sdpost outcome
#> <int> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 1 post1 t 28 28 0.89 5.07 1.4 3.2 Conv
#> 2 1 post1 c 58 58 1.22 3.52 1.76 2.58 Conv
#> 3 1 post2 t 28 28 0.89 3.64 1.4 3.15 Conv
#> 4 1 post2 c 58 58 1.22 2.86 1.76 2.8 Conv
#> 5 2 post1 t 38 38 1.89 4.07 0.4 2.2 fram
#> 6 2 post1 c 48 48 2.22 2.52 0.76 1.58 fram
#> 7 3 post1 t NA 31 NA 2.07 NA 1.2 voca
#> 8 3 post1 c NA 18 NA 1.52 NA 0.58 voca
#> 9 3 post2 t NA 32 NA 3.07 NA 1.12 voca
#> 10 3 post2 c NA 18 NA 2.32 NA 9.85 voca
Created on 2024-05-07 with reprex v2.0.2