I made an example data frame for my situation:
df <- data.frame(ID=c(1,2,3,4), AGE1=c(25,32,29,45), AGE2=c(27,34,31,47), AGE3=c(29,36,33,49), AGE4=c(31,38,35,51), SCORE1=c(20,9,12,19), SCORE2=c(9,10,17,12), SCORE3=c(15,12,13,15), SCORE4=c(13,12,8,12))
The format of my data frame is longitudinal, so the age of each person increases with each response, and the corresponding scores are also recorded. AGE1 corresponds to SCORE1, AGE2 with SCORE2, etc.
In the end, the x-axis should be ages and the y-axis should be scores. Each row should have their own line with 4 data points.
I made what the data frame should look like (after the transformation). Then I can group by ID and overlay the lines onto the plot, I think:
It seems like I would have to use pivot_longer or melt but I wasn't able to figure out how to map each age column with the score columns. Any help would be appreciated.
Three possible ways:
pivot_longer
from tidyr
with names_sep
argument. For this we rename all relevant columns to put in a separator (here we use _
)pivot_longer
from tidyr
with names_pattern
argument. Here we do not have to rename the columns, but we have to use special regex: "(.*?)(\\d+)$"
pivot_longer
from tidyr
with names_sep
argument but without renaming again with special regex:Way 1:
library(dplyr)
library(tidyr)
library(stringr)
df %>%
rename_with(., ~str_replace(., 'E', 'E_')) %>%
pivot_longer(
cols = -ID,
names_to = c(".value", "name"),
names_sep = "_") %>%
select(-name)
Way 2:
library(dplyr)
library(tidyr)
df %>%
pivot_longer(
cols = -ID,
names_to = c(".value", "name"),
names_pattern = "(.*?)(\\d+)$") %>%
select(-name)
Way 3:
library(dplyr)
library(tidyr)
df %>%
pivot_longer(
cols = -ID,
names_to = c(".value", "name"),
names_sep = "(?<=[A-Za-z])(?=[0-9])") %>%
select(-name)
ID AGE SCORE
<dbl> <dbl> <dbl>
1 1 25 20
2 1 27 9
3 1 29 15
4 1 31 13
5 2 32 9
6 2 34 10
7 2 36 12
8 2 38 12
9 3 29 12
10 3 31 17
11 3 33 13
12 3 35 8
13 4 45 19
14 4 47 12
15 4 49 15
16 4 51 12