I am trying to use pivot_longer to move my data from wide to long in R. However, I have multiple different variable that will need to be created. I want to create a column called "Assessment" that has "pre" or "post" within it. Then I want a column that says "Test" with "A or B" in it, then I want a "Error" column with the values for only the columns that currently say "error" and then a "Time" column for the values for only the columns that currently say "time" in them. Right now the data looks like this:
data.frame(
stringsAsFactors = FALSE,
Testing.Date = c("5/25/23","5/30/23","5/31/23",
"6/1/23","6/27/23","6/20/23"),
ID = c(101L, 103L, 106L, 109L, 110L, 201L),
Condition = c("Lab", "Nature", "Nature", "Lab", "Nature", "Lab"),
Age = c(21L, 22L, 22L, 21L, 29L, 56L),
Pre.Time.A = c(22.78, 15.75, 30.33, 18.43, 26.99, 24.94),
Pre.Error.A = c(0L, 0L, 1L, 0L, 0L, 0L),
Pre.Time.B = c(60.38, 61.65, 65.6, 65.73, 58.63, 60.97),
Pre.Error.B = c(0L, 0L, 1L, 2L, 0L, 0L),
Post.Time.A = c(21.48, 14.65, 20.66, 24.55, 23.87, 35.7),
Post.Error.A = c(0L, 0L, 0L, 0L, 0L, 0L),
Post.Time.B = c(45.88, 48.58, 56.45, 48.95, 53, 64.3),
Post.Error.B = c(0L, 0L, 0L, 0L, 0L, 0L)
)
I want it to look like this:
TMT_long <- tibble::tribble(
~Testing.Date, ~ID, ~Condition, ~Age, ~Assessment, ~Test, ~Time, ~Errors,
"5/25/23", 101L, "Lab", 21L, "Pre", "A", 22.78, 0L,
"5/25/23", 101L, "Lab", 21L, "Pre", "B", 60.38, 0L,
"5/25/23", 101L, "Lab", 21L, "Post", "A", 21.48, 0L,
"5/25/23", 101L, "Lab", 21L, "Post", "B", 45.88, 0L
)
I can't seem to figure out the right syntax in pivot_longer
Here's a one-liner:
tidyr::pivot_longer(df, matches("P"), names_pattern = "(.*)\\.(.*)\\.(.*)", names_to = c("Assessment", ".value", "Test"), values_drop_na = TRUE)
Output:
# A tibble: 24 × 8
Testing.Date ID Condition Age Assessment Test Time Error
<chr> <int> <chr> <int> <chr> <chr> <dbl> <int>
1 5/25/23 101 Lab 21 Pre A 22.8 0
2 5/25/23 101 Lab 21 Pre B 60.4 0
3 5/25/23 101 Lab 21 Post A 21.5 0
4 5/25/23 101 Lab 21 Post B 45.9 0
5 5/30/23 103 Nature 22 Pre A 15.8 0
6 5/30/23 103 Nature 22 Pre B 61.6 0
7 5/30/23 103 Nature 22 Post A 14.6 0
8 5/30/23 103 Nature 22 Post B 48.6 0
9 5/31/23 106 Nature 22 Pre A 30.3 1
10 5/31/23 106 Nature 22 Pre B 65.6 1
# ℹ 14 more rows
Notes:
matches()
idea, I hadn't seen that beforenames_pattern
essentially matches the column names as a regex, turning the prefix and suffix values into values in the Assessment and Test columns respectively, and the central part into the values of the time and error columns