Fill in missing rows

I have a data frame of county executives and the year they were inaugurated.

I am running a panel study with county-year as the unit of analysis. The date range is 2000 to 2004.

I would like to expand the df so that it lists who was the county executive in each year from 2000 to 2004.

For instance, I would like this df

df <- data.frame(year = c(2000, 2001, 2003, 2000, 2002, 2004),
                 executive.name = c("Johnson", "Smith", "Alleghany", "Roberts", "Clarke", "Tollson"),
                 party = c("PartyRed", "PartyYellow", "PartyGreen", "PartyYellow", "PartyOrange", "PartyRed"),
                 district = rep(c(1001, 1002), each = 3))

to look like this

df.neat <- data.frame(year = c(2000, 2001, 2002, 2003, 2004, 2000, 2001, 2002, 2003, 2004),
                      executive.name = c("Johnson", "Smith", "Smith", "Alleghany", "Alleghany", "Roberts", "Roberts", "Clarke", "Clarke", "Tollson"),
                      party = c("PartyRed", "PartyYellow", "PartyYellow", "PartyGreen", "PartyGreen", "PartyYellow", "PartyYellow", "PartyOrange", "PartyOrange", "PartyRed"),
                      district = rep(c(1001, 1002), each = 5))

Solution

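df |>
  # add the missing district-year combinations; new rows get NA for name and party
  tidyr::complete(district, year) |>
  # carry the last known executive and party forward within each district
  dplyr::group_by(district) |>
  tidyr::fill(executive.name, party) |>
  dplyr::ungroup()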
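
One caveat: complete(district, year) only expands over year values that already occur somewhere in df. That holds for this example, but it may not hold for other data. A slightly more defensive sketch, assuming the panel should always run from 2000 to 2004, passes the year range explicitly. The name df.filled is only illustrative.

```r
library(dplyr)
library(tidyr)

df <- data.frame(year = c(2000, 2001, 2003, 2000, 2002, 2004),
                 executive.name = c("Johnson", "Smith", "Alleghany", "Roberts", "Clarke", "Tollson"),
                 party = c("PartyRed", "PartyYellow", "PartyGreen", "PartyYellow", "PartyOrange", "PartyRed"),
                 district = rep(c(1001, 1002), each = 3))

df.filled <- df |>
  # spell out the panel years instead of relying on the years present in df
  # (as.numeric keeps the type in line with the existing year column)
  complete(district, year = as.numeric(2000:2004)) |>
  # make the row order explicit before filling downward
  arrange(district, year) |>
  group_by(district) |>
  fill(executive.name, party, .direction = "down") |>
  ungroup()

df.filled
```

If a district's first inauguration in the data falls after 2000, the earlier years stay NA with this approach, since there is nothing above them to carry forward.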

Result

# A tibble: 10 × 4
   district  year executive.name party      
      <dbl> <dbl> <chr>          <chr>      
 1     1001  2000 Johnson        PartyRed   
 2     1001  2001 Smith          PartyYellow
 3     1001  2002 Smith          PartyYellow
 4     1001  2003 Alleghany      PartyGreen 
 5     1001  2004 Alleghany      PartyGreen 
 6     1002  2000 Roberts        PartyYellow
 7     1002  2001 Roberts        PartyYellow
 8     1002  2002 Clarke         PartyOrange
 9     1002  2003 Clarke         PartyOrange
10     1002  2004 Tollson        PartyRed
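
To confirm that this matches the desired df.neat from the question, a quick check along these lines may help. It is only a sketch: the names result and same are illustrative, and the comparison goes column by column because the result is a tibble with a different column order than df.neat.

```r
library(dplyr)
library(tidyr)

df <- data.frame(year = c(2000, 2001, 2003, 2000, 2002, 2004),
                 executive.name = c("Johnson", "Smith", "Alleghany", "Roberts", "Clarke", "Tollson"),
                 party = c("PartyRed", "PartyYellow", "PartyGreen", "PartyYellow", "PartyOrange", "PartyRed"),
                 district = rep(c(1001, 1002), each = 3))

df.neat <- data.frame(year = c(2000, 2001, 2002, 2003, 2004, 2000, 2001, 2002, 2003, 2004),
                      executive.name = c("Johnson", "Smith", "Smith", "Alleghany", "Alleghany", "Roberts", "Roberts", "Clarke", "Clarke", "Tollson"),
                      party = c("PartyRed", "PartyYellow", "PartyYellow", "PartyGreen", "PartyGreen", "PartyYellow", "PartyYellow", "PartyOrange", "PartyOrange", "PartyRed"),
                      district = rep(c(1001, 1002), each = 5))

result <- df |>
  complete(district, year) |>
  group_by(district) |>
  fill(executive.name, party) |>
  ungroup()

# compare each column of the filled result against the target data frame
same <- all(result$district == df.neat$district,
            result$year == df.neat$year,
            result$executive.name == df.neat$executive.name,
            result$party == df.neat$party)
same
```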