Search code examples
rdplyrradix

How can I rebuild my dataset by separating year column in R


I have this dataset

GDP growth rate with year in GDP column

But I want to have a dataset like this using R

GDP growth rate with year in separate column


Solution

  • We may use pivot_longer for that

    library(tidyr)
    library(dplyr)
    pivot_longer(df1, cols = starts_with("GDP"), names_to = c(".value", "Year"),
       names_pattern = "([^\\d]+)(\\d+)") %>%
       rename(`Growth rate` = GDP_GR)
    

    -output

    # A tibble: 4 × 4
      `Country Name` `Country Code` Year  `Growth rate`
      <chr>          <chr>          <chr>         <dbl>
    1 Afghanistan    AFG            2011             NA
    2 Afghanistan    AFG            2012     1234143668
    3 Albania        ALB            2011     2703864872
    4 Albania        ALB            2012    -4429023858
    

    Or without rename step

    pivot_longer(df1, cols = starts_with("GDP"), names_to = "Year", 
         values_to = "Growth rate", names_pattern = "\\D+(\\d+)")
    # A tibble: 4 × 4
      `Country Name` `Country Code` Year  `Growth rate`
      <chr>          <chr>          <chr>         <dbl>
    1 Afghanistan    AFG            2011             NA
    2 Afghanistan    AFG            2012     1234143668
    3 Albania        ALB            2011     2703864872
    4 Albania        ALB            2012    -4429023858
    

    data

    df1 <- structure(list(`Country Name` = c("Afghanistan", "Albania"), 
        `Country Code` = c("AFG", "ALB"), GDP_GR2011 = c(NA, 2703864872
        ), GDP_GR2012 = c(1234143668, -4429023858)), 
    class = c("tbl_df", 
    "tbl", "data.frame"), row.names = c(NA, -2L))