
Set row-wise duplicates to NA for second-degree neighbors


I am probably just not hitting the right search terms, but I would like to set an entry to NA if the same value already appears earlier in the same row.

Starting from df I want to get to df2.

df <- data.frame(t(data.frame(c("Ashanti","Brong Ahafo","Central","Eastern","Western",NA,
                 "Ashanti","Eastern","Northern","Volta","Western"),
                 c("Brong Ahafo","Ashanti","Eastern","Northern","Volta",
                   "Western","Brong Ahafo","Central","Eastern","Western",NA))))
rownames(df) <- NULL
names(df) <- c("id","nbr_1","nbr_2","nbr_3","nbr_4","nbr_5","scdnbr_1",
               "scdnbr_2","scdnbr_3","scdnbr_4","scdnbr_5")

df2 <- data.frame(t(data.frame(c("Ashanti","Brong Ahafo","Central","Eastern","Western",NA,
                  NA,NA,"Northern","Volta",NA),
                  c("Brong Ahafo","Ashanti","Eastern","Northern","Volta","Western",NA,
                    "Central",NA,NA,NA))))
rownames(df2) <- NULL
names(df2) <- c("id","nbr_1","nbr_2","nbr_3","nbr_4","nbr_5","scdnbr_1",
                "scdnbr_2","scdnbr_3","scdnbr_4","scdnbr_5")

Probably not necessary, but for context: the goal is to get the second-order neighboring regions within Ghana, starting from the poly2nb function in spdep:

pacman::p_load("spdep","sp","expp","raster","dplyr","tidyr")
ghana <- getData('GADM', country='GHA', level=1)


# first degree neighbors
nb <- poly2nb(ghana, row.names=ghana$NAME_1)
nb <- neighborsDataFrame(nb)

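# one row per region, one column per first-degree neighbor (nbr_1, nbr_2, ...)
# stored as nb2, the name the second-degree step below expects
nb2 <- nb %>% group_by(id) %>% mutate(nbr = sequence(n())) %>% 
  spread(key = nbr, value = id_neigh, sep="_")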

# second degree neighbors

nb2_2 <- nb2 %>% 
  rename(scdnbr_1=nbr_1,
         scdnbr_2=nbr_2,
         scdnbr_3=nbr_3,
         scdnbr_4=nbr_4,
         scdnbr_5=nbr_5)

nb3 <- nb2 %>% 
  left_join(nb2_2, by=c("nbr_1"="id"))

I would then continue joining in the second-degree neighbors for the four remaining first-degree neighbors. Before that step, though, I would like to blank out the repeated entries as described above (going from df to df2).
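For illustration, the same goal can also be approached in long format, where repeats are dropped before anything is spread into columns. The sketch below is an assumption-laden toy, not the asker's pipeline: the regions `A` to `D`, the adjacency between them, and the names `nb_long`, `first`, `onward`, and `second` are all hypothetical, and only base R is used.

```r
# Hypothetical first-degree neighbor list in long form (toy data, not the GADM adjacency)
nb_long <- data.frame(
  id       = c("A", "A", "B", "B", "C", "C", "D", "D"),
  id_neigh = c("B", "C", "A", "D", "A", "D", "B", "C"),
  stringsAsFactors = FALSE
)

# Join the list to itself: region -> first-degree neighbor (via) -> that neighbor's neighbor (scd)
first  <- setNames(nb_long, c("id", "via"))
onward <- setNames(nb_long, c("via", "scd"))
second <- merge(first, onward, by = "via")

# Drop the region itself and anything that is already a first-degree neighbor
second <- second[second$scd != second$id, ]
is_first <- paste(second$id, second$scd) %in% paste(nb_long$id, nb_long$id_neigh)
second <- second[!is_first, ]

# Keep each (region, second-degree neighbor) pair once, ordered by region
second <- unique(second[, c("id", "scd")])
second <- second[order(second$id, second$scd), ]
rownames(second) <- NULL
second
```

With this layout, a region reachable through several first-degree neighbors appears only once, so no row-wise clean-up is needed before reshaping to wide form.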

Thank you all!


Solution

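  • To get the desired output, replace every repeated value within a row with NA; the remaining steps only rebuild the column labels:

    library(dplyr)
    library(tidyr)
    library(stringr)  # for str_replace()

    # within each row, every value that already appeared earlier becomes NA
    df1 <- t(apply(df, 1, function(x) replace(x, duplicated(x), NA)))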
    
    x <- df1 %>% 
      as_tibble() %>% 
      pivot_longer(
        everything()
      ) %>%
      group_by(value) %>% 
      mutate(id = row_number()-1,
             value = paste0("X.",value,"."),
             value = ifelse(value == "X.NA." & id > 0, paste0(NA, "..", id), value),
             value = ifelse(value == "X.NA.", NA, value)) %>% 
      select(-id) %>% 
      mutate(value = str_replace(value, " ", ".")) %>% 
      pivot_wider(
        names_from = name,
        values_from = value
      )
    
    colnames(df1) <- x
    
    df1
    
         X.Ashanti. X.Brong.Ahafo. X.Central. X.Eastern. X.Western. <NA> NA..1 NA..2 X.Northern. X.Volta. NA..3
    [1,] "Ashanti"  "Brong Ahafo"  "Central"  "Eastern"  "Western"  NA   NA    NA    "Northern"  "Volta"  NA
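
If the relabelled columns are not needed, a shorter variant keeps the original column names and returns a data frame. This is a minimal sketch rather than the answer's original code: `blank_repeats` and `out` are names made up for illustration, and the data are the two rows from the question, rebuilt here so the block runs on its own.

```r
# The two example rows from the question
vals <- rbind(
  c("Ashanti", "Brong Ahafo", "Central", "Eastern", "Western", NA,
    "Ashanti", "Eastern", "Northern", "Volta", "Western"),
  c("Brong Ahafo", "Ashanti", "Eastern", "Northern", "Volta", "Western",
    "Brong Ahafo", "Central", "Eastern", "Western", NA)
)
df <- as.data.frame(vals, stringsAsFactors = FALSE)
names(df) <- c("id", paste0("nbr_", 1:5), paste0("scdnbr_", 1:5))

# Hypothetical helper: keep the first occurrence of each value, set later repeats to NA
blank_repeats <- function(x) replace(x, duplicated(x), NA)

# apply() works row by row and returns the rows as columns, hence the t()
out <- as.data.frame(t(apply(df, 1, blank_repeats)), stringsAsFactors = FALSE)
out
```

Because `duplicated()` also flags a second `NA` as a repeat, cells that were already `NA` simply stay `NA`.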