I'm not sure why I'm struggling with this, but I'm trying to create a dataset in which each subject ("id" in this case) has an individual IQ score. Each subject also reads 20 letters, and each letter has a fixed score attached to it ("value"). What I want is for each of the 300 people in this dataset to "read" every letter, with a constant IQ per person and a constant value per letter. For example, subject 1 should have read letters a to t, with a single IQ drawn from a normal distribution. So far this is what I have:
id <- 1:300
iq <- rnorm(n=300, mean=120, sd=15)
letter <- rep(c("a","b","c","d","e","f","g","h","i","j",
"k","l","m","n","o","p","q","r","s","t"),15)
value <- rep(c(2,2,1,2,2,2,2,2,3,2,
3,1,3,2,1,2,2,2,1,2),15)
df <- data.frame(id,iq,letter,value)
df$id <- as.character(id)
This isn't what I need, as the head of the data frame shows:
head(df)
You can see that each person has a unique IQ score, but only reads one letter, not all of them:
id iq letter value
1 1 126.35025 a 2
2 2 150.08165 b 2
3 3 105.88712 c 1
4 4 106.86652 d 2
5 5 97.86159 e 2
6 6 116.39497 f 2
What I want is something more like this:
id2 <- rep(1,4)
iq2 <- 120
letter2 <- c("a","b","c","d")
value2 <- c(2,2,1,2)
df2 <- data.frame(id2,
iq2,
letter2,
value2)
This gives the following data frame for one person who "reads" 4 letters:
id2 iq2 letter2 value2
1 1 120 a 2
2 1 120 b 2
3 1 120 c 1
4 1 120 d 2
How do I solve this problem?
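A solution using tidyr::crossing() and inner_join(). crossing() builds every combination of id and letter (300 × 20 = 6000 rows), inner_join() attaches each subject's IQ to those rows, and rep(value, 300) repeats the 20 letter values once per subject: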
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.2.1
#> Warning: package 'tibble' was built under R version 4.2.1
value <- c(2, 2, 1, 2, 2, 2, 2, 2, 3, 2, 3, 1, 3, 2, 1, 2, 2, 2, 1, 2)
df_merged <- tibble(id = 1:300,
iq = rnorm(n = 300, mean = 120, sd = 15)) |>
inner_join(crossing(id = 1:300,
letter = letters[1:20])) |>
mutate(value = rep(value, 300))
#> Joining, by = "id"
# inspect a single subject (id 5) to check the result
df_merged |>
filter(id == 5)
#> # A tibble: 20 × 4
#> id iq letter value
#> <int> <dbl> <chr> <dbl>
#> 1 5 116. a 2
#> 2 5 116. b 2
#> 3 5 116. c 1
#> 4 5 116. d 2
#> 5 5 116. e 2
#> 6 5 116. f 2
#> 7 5 116. g 2
#> 8 5 116. h 2
#> 9 5 116. i 3
#> 10 5 116. j 2
#> 11 5 116. k 3
#> 12 5 116. l 1
#> 13 5 116. m 3
#> 14 5 116. n 2
#> 15 5 116. o 1
#> 16 5 116. p 2
#> 17 5 116. q 2
#> 18 5 116. r 2
#> 19 5 116. s 1
#> 20 5 116. t 2
Created on 2022-07-29 by the reprex package (v2.0.1)
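For comparison, the same long format can probably be built in base R without the tidyverse. The following is only a sketch under a few assumptions: the names subjects, items, and df_long are made up for illustration, the seed is arbitrary, and the approach (two small lookup tables combined with merge(..., by = NULL), which returns the Cartesian product of its inputs) is an alternative to the crossing() solution above, not a replacement for it.

```r
set.seed(42)  # assumption: arbitrary seed, only so the sketch is reproducible

n_subjects <- 300
n_letters  <- 20

# One row per subject: each id gets exactly one IQ draw
subjects <- data.frame(
  id = as.character(seq_len(n_subjects)),
  iq = rnorm(n = n_subjects, mean = 120, sd = 15)
)

# One row per letter: each letter gets exactly one value
items <- data.frame(
  letter = letters[seq_len(n_letters)],
  value  = c(2, 2, 1, 2, 2, 2, 2, 2, 3, 2, 3, 1, 3, 2, 1, 2, 2, 2, 1, 2)
)

# by = NULL makes merge() pair every subject row with every letter row
df_long <- merge(subjects, items, by = NULL)

# Sort so that each subject's 20 letters sit together, a to t
df_long <- df_long[order(as.integer(df_long$id), df_long$letter), ]
rownames(df_long) <- NULL

head(df_long, 4)
```

Keeping the letter values in their own table means each value is tied to its letter by name rather than by row position, so the result does not depend on how the combined rows happen to be ordered.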