Tags: r, dplyr, tidyverse, tidyr, data-cleaning

Creating a new variable from a range defined by other variables


I want to create a variable that contains every year from startyear through endyear - 1. My data looks like this:

country leader startyear endyear
US Eisenhower 1953 1961
US Kennedy 1961 1963

I want the result to look like this:

country leader startyear endyear year
US Eisenhower 1953 1961 1953
US Eisenhower 1953 1961 1954
US Eisenhower 1953 1961 1955
US Eisenhower 1953 1961 1956
US Eisenhower 1953 1961 1957
US Eisenhower 1953 1961 1958
US Eisenhower 1953 1961 1959
US Eisenhower 1953 1961 1960
US Kennedy 1961 1963 1961
US Kennedy 1961 1963 1962

My data set contains many countries, and I want a single piece of code that handles the whole data set.


Solution

  • We can build the sequence (:) from startyear to endyear - 1 for each row with map2(), then unnest() the resulting list column so that each year gets its own row

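    library(dplyr)
    library(purrr)
    library(tidyr)

    df1 %>%
       # build one integer sequence per row: startyear, ..., endyear - 1
       mutate(year = map2(startyear, endyear - 1, `:`)) %>%
       # expand the list column so that each year gets its own row
       unnest(year)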
    

    Output

    # A tibble: 10 × 5
       country leader     startyear endyear  year
       <chr>   <chr>          <int>   <int> <int>
     1 US      Eisenhower      1953    1961  1953
     2 US      Eisenhower      1953    1961  1954
     3 US      Eisenhower      1953    1961  1955
     4 US      Eisenhower      1953    1961  1956
     5 US      Eisenhower      1953    1961  1957
     6 US      Eisenhower      1953    1961  1958
     7 US      Eisenhower      1953    1961  1959
     8 US      Eisenhower      1953    1961  1960
     9 US      Kennedy         1961    1963  1961
    10 US      Kennedy         1961    1963  1962
    

    Data

    df1 <- structure(list(country = c("US", "US"), leader = c("Eisenhower", 
    "Kennedy"), startyear = c(1953L, 1961L), endyear = c(1961L, 1963L
    )), class = "data.frame", row.names = c(NA, -2L))
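
One edge case worth checking when running this over many countries: if a row has startyear equal to endyear, then `startyear:(endyear - 1)` counts downward (for example, 1970:1969 gives 1970 and 1969) instead of producing no years. The following is only a sketch of one way to guard against that, using seq_len(), which returns an empty vector for a length of 0. The data frame df2 and its "XX" / "Example" row are hypothetical and added purely for illustration; they are not part of the original question.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical data: the third row is made up and has startyear == endyear
df2 <- data.frame(
  country   = c("US", "US", "XX"),
  leader    = c("Eisenhower", "Kennedy", "Example"),
  startyear = c(1953L, 1961L, 1970L),
  endyear   = c(1961L, 1963L, 1970L)
)

out <- df2 %>%
  # seq_len(n) is empty when n is 0; max(..., 0L) also covers endyear < startyear
  mutate(year = map2(startyear, endyear,
                     ~ .x + seq_len(max(.y - .x, 0L)) - 1L)) %>%
  # unnest() drops rows whose list element is empty (keep_empty = FALSE by default)
  unnest(year)

out
```

With this variant, rows that span no years drop out of the result. If those rows should be kept with a missing year, unnest(year, keep_empty = TRUE) is an option.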