I'm trying to pivot a wide dataset to long and could not find a solution to my problem through searching here. I have a wide dataset that includes 3 friendship nominations over 3 waves, and then 3 follow up questions about each nomination at each wave. Sample data below shows the structure. The follow up questions are those with a, b, or c in them, and the number corresponds to the friend column at that wave (ex, w1a1 is wave 1 follow up question for friend 1 (w1fid1)). I have included a sample dataset of what I have and what I want:
Sample data:
df<-read.table(text= "id w1fid1 w1fid2 w1fid3 w2fid1 w2fid2 w2fid3 w3fid1 w3fid2 w3fid3 w1a1 w1a2 w1a3 w1b1 w1b2 w1b3 w1c1 w1c2 w1c3 w2a1 w2a2 w2a3 w2b1 w2b2 w2b3 w2c1 w2c2 w2c3 w3a1 w3a2 w3a3 w3b1 w3b2 w3b3 w3c1 w3c2 w3c3
1 10 3 9 4 13 9 15 14 NA 1 1 0 0 1 0 1 0 0 1 1 1 1 0 0 1 1 1 1 1 NA 1 0 NA 0 0 NA
2 13 NA 20 4 18 19 12 10 8 1 NA 0 0 NA 0 0 NA 0 1 0 0 1 1 0 0 1 0 0 0 0 1 0 0 1 0 0
3 NA 9 19 19 8 20 17 9 2 NA 1 1 NA 0 0 NA 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0
4 13 19 9 19 16 2 2 11 14 1 1 1 1 1 1 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0
5 11 8 9 18 20 7 9 17 12 0 1 1 0 0 0 1 1 1 1 0 1 1 1 0 1 0 1 1 0 0 1 1 1 0 1 0", header = TRUE)
Desired output:
df <- read.table(text= "id wave fid a b c
1 1 10 1 0 1
1 1 3 1 1 0
1 1 9 0 0 0
1 2 4 1 1 1
1 2 13 1 0 1
1 2 9 1 0 1
1 3 15 1 1 0
1 3 14 1 0 0
1 3 NA NA NA NA
2 1 14 1 0 0
2 1 NA NA NA NA
2 1 20 0 0 0
2 2 4 1 1 0
2 2 18 0 1 1
2 2 19 0 0 0
2 3 12 0 1 1
2 3 10 0 0 0
2 3 8 0 0 0
3 1 NA NA NA NA
3 1 9 1 0 0
3 1 19 1 0 0
3 2 19 0 0 1
3 2 8 0 0 0
3 2 20 1 0 0
3 3 17 0 0 1
3 3 9 0 0 1
3 3 2 1 0 0
4 1 13 1 1 1
4 1 19 1 1 1
4 1 9 1 1 0
4 2 19 0 0 0
4 2 16 0 0 0
4 2 2 1 1 0
4 3 2 0 0 0
4 3 11 0 0 1
4 3 14 0 1 0
5 1 11 0 0 1
5 1 8 1 0 1
5 1 9 1 0 1
5 2 18 1 1 1
5 2 20 0 1 0
5 2 7 1 0 1
5 3 9 1 1 0
5 3 17 0 1 1
5 3 12 0 1 0", header = TRUE)
I have used this code:
df %>% pivot_longer(-id,
names_to = c('wave', 'type', '.value'),
names_pattern = "(\\w+\\d+)(\\w+)(\\d+)")
Which gave me this:
id wave type `1` `2` `3`
1 w1 fid 10 3 9
1 w2 fid 4 13 9
1 w3 fid 15 14 NA
1 w1 a 1 1 0
1 w1 b 0 1 0
1 w1 c 1 0 0
1 w2 a 1 1 1
1 w2 b 1 0 0
1 w2 c 1 1 1
1 w3 a 1 1 NA
1 w3 b 1 0 NA
1 w3 c 0 0 NA
2 w1 fid 13 NA 20
2 w2 fid 4 18 19
2 w3 fid 12 10 8
2 w1 a 1 NA 0
2 w1 b 0 NA 0
2 w1 c 0 NA 0
2 w2 a 1 0 0
2 w2 b 1 1 0
2 w2 c 0 1 0
2 w3 a 0 0 0
2 w3 b 1 0 0
2 w3 c 1 0 0
3 w1 fid NA 9 19
3 w2 fid 19 8 20
3 w3 fid 17 9 2
3 w1 a NA 1 1
3 w1 b NA 0 0
3 w1 c NA 0 0
3 w2 a 0 0 1
3 w2 b 0 0 0
3 w2 c 1 0 0
3 w3 a 0 0 1
3 w3 b 0 0 0
3 w3 c 1 1 0
4 w1 fid 13 19 9
4 w2 fid 19 16 2
4 w3 fid 2 11 14
4 w1 a 1 1 1
4 w1 b 1 1 1
4 w1 c 1 1 0
4 w2 a 0 0 1
4 w2 b 0 0 1
4 w2 c 0 0 0
4 w3 a 0 0 0
4 w3 b 0 0 1
4 w3 c 0 1 0
5 w1 fid 11 8 9
5 w2 fid 18 20 7
5 w3 fid 9 17 12
5 w1 a 0 1 1
5 w1 b 0 0 0
5 w1 c 1 1 1
5 w2 a 1 0 1
5 w2 b 1 1 0
5 w2 c 1 0 1
5 w3 a 1 0 0
5 w3 b 1 1 1
5 w3 c 0 1 0
I assume I have to rewrite names_to and names_pattern to get what I want, but I'm stuck on the proper syntax to get there.
You can use names_pattern
and the special value .value
in names_to
:
library(tidyverse)
df %>%
pivot_longer(cols = -id,
names_pattern = '^w(\\d)(\\D+)\\d$',
names_to = c('wave', '.value'))
#> id wave fid a b c
#> 1 1 1 10 1 0 1
#> 2 1 1 3 1 1 0
#> 3 1 1 9 0 0 0
#> 4 1 2 4 1 1 1
#> 5 1 2 13 1 0 1
#> 6 1 2 9 1 0 1
#> 7 1 3 15 1 1 0
#> 8 1 3 14 1 0 0
#> 9 1 3 NA NA NA NA
#> 10 2 1 13 1 0 0
#> 11 2 1 NA NA NA NA
#> 12 2 1 20 0 0 0
#> 13 2 2 4 1 1 0
#> 14 2 2 18 0 1 1
#> 15 2 2 19 0 0 0
#> 16 2 3 12 0 1 1
#> 17 2 3 10 0 0 0
#> 18 2 3 8 0 0 0
#> 19 3 1 NA NA NA NA
#> 20 3 1 9 1 0 0
#> 21 3 1 19 1 0 0
#> 22 3 2 19 0 0 1
#> 23 3 2 8 0 0 0
#> 24 3 2 20 1 0 0
#> 25 3 3 17 0 0 1
#> 26 3 3 9 0 0 1
#> 27 3 3 2 1 0 0
#> 28 4 1 13 1 1 1
#> 29 4 1 19 1 1 1
#> 30 4 1 9 1 1 0
#> 31 4 2 19 0 0 0
#> 32 4 2 16 0 0 0
#> 33 4 2 2 1 1 0
#> 34 4 3 2 0 0 0
#> 35 4 3 11 0 0 1
#> 36 4 3 14 0 1 0
#> 37 5 1 11 0 0 1
#> 38 5 1 8 1 0 1
#> 39 5 1 9 1 0 1
#> 40 5 2 18 1 1 1
#> 41 5 2 20 0 1 0
#> 42 5 2 7 1 0 1
#> 43 5 3 9 1 1 0
#> 44 5 3 17 0 1 1
#> 45 5 3 12 0 1 0