I am working with a dataset of U.S. House elections and want to create a variable for incumbency. In particular, I want the variable to equal 1 for a party that won the last election and is running again. While I have used case_when() to specify multiple contemporaneous conditions before, I cannot seem to use the function (or any other I know) to assign a 1 to the new incumbency variable based on past election outcomes. This appears to be a very simple task, but it is frustrating me. Any help or insight would be appreciated.
For example, here is code for a simulated dataset that is representative (except that the winners are by plurality rule for ease).
library(tidyverse)
data<-data.frame(
party=rep(1:2,100),
district=rep(1:2,each=2,50),
state=rep(1:2,each=4,25),
year=rep(1:25,each=8)
) %>%
mutate(voteshare=rnorm(200,mean=50,sd=10)) %>%
group_by(year,state,district) %>%
mutate(rank=dense_rank(desc(voteshare)))
After ranking the contemporaneous outcomes, I would like to mutate a new variable (i for short) that equals 1 if rank = 1 in the past election (year - 1). The closest I have gotten is the following, which simply skips the first election but then bases i on the contemporaneous rank, not on the rank from the year before as I would like.
data<-data %>%
group_by(state,district) %>%
mutate(i=case_when(
rank==1 & year-1 ~ 1
))
I have looked at other similar Stack Overflow questions, but the answers were not clearly applicable to my case. I apologize if this question is a duplicate due to my lack of understanding.
To be clear, I am expecting a new column (i) with ones and zeros, or NA's otherwise. The ones would be assigned to parties in the current election year that won (rank = 1) in the previous election year, with presumably NA's for the first election year (since there is no past election from which to tell incumbency). Thank you in advance for any help!
If I understand your expected results correctly, you may want to try something like this:
data <- data %>%
arrange(state, district, party, year) %>% # order year within party
group_by(state, district, party) %>%
mutate(i = as.numeric(lag(rank) == 1))
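For context on why the original attempt did not work: in rank==1 & year-1, the numeric value year-1 is coerced to a logical, so it is FALSE only when year is 1 and TRUE otherwise. The condition therefore reduces to the contemporaneous rank==1 in every year after the first. Below is a minimal sketch of the lag() approach on a small hand-made dataset so the result can be checked by eye. The data frame toy and the column i_strict are hypothetical names introduced here for illustration; i_strict is an optional variant that assumes you only want to count the previous row when it is exactly one year earlier, which matters only if some elections are missing from the data.

```r
library(dplyr)

# Hypothetical toy data: one state, one district, two parties, three elections.
toy <- data.frame(
  state    = 1,
  district = 1,
  year     = rep(1:3, each = 2),
  party    = rep(1:2, times = 3),
  rank     = c(1, 2,   # year 1: party 1 wins
               2, 1,   # year 2: party 2 wins
               2, 1)   # year 3: party 2 wins again
)

toy <- toy %>%
  arrange(state, district, party, year) %>%   # order years within each party
  group_by(state, district, party) %>%
  mutate(
    # rank of the same party in the previous row, i.e. the previous election
    i = as.numeric(lag(rank) == 1),
    # stricter variant (assumption): previous row must be exactly one year earlier
    i_strict = as.numeric(lag(rank) == 1 & lag(year) == year - 1)
  ) %>%
  ungroup() %>%
  arrange(year, party)

# Sorted by year then party, toy$i is NA, NA, 1, 0, 0, 1
print(toy)
```

Grouping by party as well as state and district is what makes lag() look back at the same party's previous election rather than at the other party's row in the same year.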