
How to stack the same data frame on itself and change a value in a column per stack in R


I wasn't sure how to word this question.

I have a data frame called p08 that shows how each state voted in the 2008 election. The indicator variable is named 'DemStatus', where 1 == voted Democrat and 0 == voted Republican.

I want to label each state as Republican or Democrat for all four years between elections. For instance, Alabama voted Republican in 2008, so I want to label it as 0 (Republican) for 2008, 2009, 2010, and 2011.

I accomplished this by copying the data frame and naming it something else for each year. However, this is a very tedious process since I have election data from the 90s until 2020.

QUESTION: Is there a faster/simpler way to accomplish exactly what I have below?

p08=structure(list(STATE = c("Alabama", "Alaska", "Arizona", "Arkansas", 
"California", "Colorado", "Connecticut", "Delaware", "Dist. of Col.", 
"Florida", "Georgia", "Hawaii", "Idaho", "Illinois", "Indiana", 
"Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland", 
"Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri", 
"Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey", 
"New Mexico", "New York", "North Carolina", "North Dakota", "Ohio", 
"Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina", 
"South Dakota", "Tennessee", "Texas", "Utah", "Vermont", "Virginia", 
"Washington", "West Virginia", "Wisconsin", "Wyoming"), YEAR = c(2008, 
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 
2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 2008, 
2008, 2008, 2008, 2008, 2008, 2008), DemStatus = c(0, 0, 0, 0, 
1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 
0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 
1, 1, 0, 1, 0)), row.names = 10:60, class = "data.frame")

party09=p08
party09$YEAR=2009

party10=p08
party10$YEAR=2010

party11=p08
party11$YEAR=2011

library(dplyr)  # provides bind_rows()
party08_11 = bind_rows(p08,party09,party10,party11)

Solution

  • Another way is to create a sequence of years for each row. This works for any dataset, regardless of which or how many election years it contains, provided that elections are always four years apart, since each row is expanded to its election year plus the following three years.

    library(data.table)
    
    p08 <- setDT(p08)[, .(STATE, YEAR = seq(YEAR, YEAR + 3L), DemStatus), by = 1:nrow(p08)][, nrow := NULL]
    

    Output (showing first and last 5 rows):

             STATE YEAR DemStatus
      1:   Alabama 2008         0
      2:   Alabama 2009         0
      3:   Alabama 2010         0
      4:   Alabama 2011         0
      5:    Alaska 2008         0
     ---                         
    200: Wisconsin 2011         1
    201:   Wyoming 2008         0
    202:   Wyoming 2009         0
    203:   Wyoming 2010         0
    204:   Wyoming 2011         0
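
For comparison, the same expansion can be sketched in base R without any packages. This is only an illustration under assumptions: the `elections` data frame below is a small hypothetical stand-in for the real data (the 2012 rows are not part of the question's `p08` and are included only to show that several election years are handled at once), and it relies on elections being exactly four years apart.

```r
# Hypothetical toy data: two states, two election years.
# The 2012 rows are NOT in the question's p08; they are here for illustration.
elections <- data.frame(
  STATE     = c("Alabama", "Indiana", "Alabama", "Indiana"),
  YEAR      = c(2008, 2008, 2012, 2012),
  DemStatus = c(0, 1, 0, 0)
)

# Repeat every row four times; each = 4 keeps the copies of a row adjacent.
expanded <- elections[rep(seq_len(nrow(elections)), each = 4), ]

# The offsets 0, 1, 2, 3 are recycled over each block of four copies,
# turning the election year into that year plus the following three.
expanded$YEAR <- expanded$YEAR + 0:3

# Drop the duplicated row names left over from the indexing step.
rownames(expanded) <- NULL

print(expanded)
```

Because the offsets are recycled per block of four rows, this only gives the intended result when every row is repeated exactly four times, which is the same four-year assumption the data.table answer makes.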