Search code examples
rexcelmissing-data

Fill in the missing values if rows in other columns are the same


I have a table which looks the following way:

Name Region Id
Name1 US 123
Name1 US
Name2 US 122
Name3 US 124
Name1 UK
Name1 UK 135
Name2 UK 140
Name3 US

As you can see there are empty values in the ID column which I want to be filled with the values of other rows if the column Name and Region correspond. So it should look like this:

Name Region Id
Name1 US 123
Name1 US 123
Name2 US 122
Name3 US 124
Name1 UK 135
Name1 UK 135
Name2 UK 140
Name3 US

Some of the columns will still be left empty but the majority will be fine. Is there a way to do so in R? Maybe it will be easier to so in Excel even (Vlookup maybe I am not sure). Share if you have any ideas)


Solution

  • The order of rows is important here Eg - Name3 US in row 4 and row 8 are not related and treated differently. We can create a new group using consecutive_id and then fill the Id values for each group.

    library(dplyr)
    library(tidyr)
    
    df %>%
      mutate(group_id = consecutive_id(Name, Region)) %>%
      group_by(group_id) %>%
      fill(Id, .direction = "downup") %>%
      ungroup
    
    # A tibble: 8 × 4
    #  Name  Region    Id group_id
    #  <chr> <chr>  <int>    <int>
    #1 Name1 US       123        1
    #2 Name1 US       123        1
    #3 Name2 US       122        2
    #4 Name3 US       124        3
    #5 Name1 UK       135        4
    #6 Name1 UK       135        4
    #7 Name2 UK       140        5
    #8 Name3 US        NA        6
    

    group_id column is just for explanation purposes on how groups are created, you may drop the column if you don't need it.

    data

    It is easier to help if you provide data in a reproducible format

    df <- structure(list(Name = c("Name1", "Name1", "Name2", "Name3", "Name1", 
    "Name1", "Name2", "Name3"), Region = c("US", "US", "US", "US", 
    "UK", "UK", "UK", "US"), Id = c(123L, NA, 122L, 124L, NA, 135L, 
    140L, NA)), row.names = c(NA, -8L), class = "data.frame")