How to pivot longer a longitudinal dataset

I have a longitudinal dataset which I created by merging different datasets using the personal identifier column.The dataset columns are in the order personal identifier, a_sex, a_countryofbirth, a_health, a_educationstatus, b_sex, b_countryofbirth, b_health, b_educationstatus, c_sex, c_countryofbirth, c_health, c_educationstatus etc all the way to l. All variables starting with a_ represented the first wave, variables starting with b represented the second wave and so on-

I am trying to create a new variable called Wave using Pivot longer so that my table looks like: -

Table: InPreg_transformed

 - Person ID    Wave  Sex CountryofBirth Health

I used this code among others but it did work.

InPreg_transformed<- InPregDF %>%

  pivot_longer(cols = contains("_"),

               names_to = c("_value", "Wave"),

               names_pattern = "(_+)"

The other code I used: -

InPreg_transformed<- InPreg %>%

          pivot_longer(cols = contains("."), names_to = c(".value", 
          "Wave"), names_pattern = "(.+).(.+)")

 summary(InPreg_transformed)

Please assist

Solution

Just to be sure that I understood correctly, I created a random nonsense example with n "individuals".

First load the librairies:

library(tibble)
library(dplyr)
library(tidyr)

Then create the dataset:

sex <- c("Male", "Female")
europe <- c("Belarus", "Belgium", "Bulgaria",
            "Croatia", "CzechRepublic", "Estonia", "France", 
            "Germany", "Hungary", "Ireland", "Italia", "Latvia", "Lithuania", 
            "Luxembourg", "Netherlands", "Poland", "Portugal", "Romania", 
            "Slovakia", "Slovenia", "Spain")
health <- c("Excellent", "Good", "Fair", "Poor")
education <- c("High School", "Bachelor's", "Master's", "PhD")
n <- 10

wdat <- tibble(
  ID = sprintf("Ind%02i", 1:n), # IDs
  a_sex = sample(sex, n, replace = TRUE),
  a_countryofbirth = sample(europe, n, replace = TRUE),
  a_health = sample(health, n, replace = TRUE),
  a_educationstatus = sample(education, 10, replace = TRUE),
  b_sex = sample(sex, n, replace = TRUE),
  b_countryofbirth = sample(europe, n, replace = TRUE),
  b_health = sample(health, n, replace = TRUE),
  b_educationstatus = sample(education, 10, replace = TRUE),
  c_sex = sample(sex, n, replace = TRUE),
  c_countryofbirth = sample(europe, n, replace = TRUE),
  c_health = sample(health, n, replace = TRUE),
  c_educationstatus = sample(education, 10, replace = TRUE))

The data wdat contains a unique ID for each individual and then three blocks of columns.

With this data, the syntax to transform it to a "long" format can be done with the pivot_longer function like so

wdat %>% 
  pivot_longer(
    -ID,
    names_to = c("wave", ".value"),
    names_pattern = "(.)_(.*)"
    )

where

names_pattern = "(.)_(.*)" means that there are two important pieces of information in the column name, a single character first and a string, separated with _,
names_to = c("wave", ".value") means that the single character will go in a column called wave, and the value in the wide columns will go in columns whose names are based on the common pattern, e.g., all the values in a_sex, b_sex and c_sex will go in a column called sex

EDIT: using names_sep is much easier in this case

wdat %>% 
  pivot_longer(
    -ID,
    names_sep = "_",
    names_to = c("wave", ".value")
  )