Search code examples
ralgorithmigraphbreadth-first-search

Writing a BFS Search Algorithm by Hand


I am trying to learn more about search algorithms for network graphs. To illustrate this, I created the following example.

Step 1: Suppose there are 100 countries (country_1....country_100) that are randomly connected to each other

set.seed(123)


library(igraph)


countries <- paste0("country_", 1:100)


g <- make_empty_graph(100)


num_edges <- 200
edge_list <- sample(countries, size = num_edges * 2, replace = TRUE)
edge_list <- matrix(edge_list, ncol = 2, byrow = TRUE)
g <- graph_from_edgelist(edge_list, directed = FALSE)


V(g)$name <- countries

plot(g, vertex.label.cex = 0.7, vertex.label.color = "black", vertex.label.dist = 2)

Step 2: Now, suppose 20 people (person_A...person_T) live in these countries (each country can only have at most one person - 80 of the countries will be empty):

edge_list <- as_edgelist(g)

df <- as.data.frame(edge_list)

colnames(df) <- c("from", "to")

people <- paste0("person_", LETTERS[1:20])

assignment <- sample(countries, size = length(people), replace = FALSE)
names(assignment) <- people

df2 <- data.frame(country = countries)


df2$person <- ifelse(df2$country %in% assignment, names(assignment)[match(df2$country, assignment)], "empty")

Step 3: As an optional step, we can visualize the results:

library(visNetwork)

df2$color <- ifelse(df2$person == "empty", "grey", "red")

df2$label <- ifelse(df2$person == "empty", df2$country, paste0(df2$person, "\n", df2$country))

nodes <- data.frame(id = df2$country, label = df2$label, color = df2$color)

edges <- df

visNetwork(nodes, edges) %>% 
    visInteraction(navigationButtons = TRUE)

enter image description here

My Problem: Suppose we take "person_A" - I want to find out who is the nearest person to "person_A" and which country this person lives in. I am interested in learning how to write a BFS algorithm for this problem by hand - for example: take person_A and search everyone in a radius of degree1 - if no one is found, now search everyone in a radius of degree2 ... continue until you find the first person(s).

I know how to use a pre-built implementation of this algorithm :

adj_matrix <- as_adjacency_matrix(g)

diag(adj_matrix) <- 0

shortest_paths <- shortest.paths(g)

df2_filtered <- subset(df2, person != "empty")
selected_countries <- intersect(rownames(shortest_paths), df2_filtered$country)

filtered_paths <- shortest_paths[selected_countries, selected_countries]

item = df2[df2$person %in% c("person_A"), ]

#answer (exclude distance = 0, i.e. the same country itself)
sort(filtered_paths[rownames(filtered_paths) == item$country, ])[2]

Can someone please show me how I could write a search algorithm (by hand) to accomplish this task which starts with a person's name - and then prints the results of the search at each step until a person is found?


Solution

  • Update

    I don't think you are looking for BFS but just the ego networks defined by the nearest valid neighbors. Probably you can try the code below

    f <- function(p, df = df2, graph = g) {
        v <- df %>%
            filter(person == p) %>%
            select(country) %>%
            pluck(1)
    
        nulls <- df2 %>%
            filter(person == "empty") %>%
            select(country) %>%
            pluck(1)
    
        pers <- df2 %>%
            filter(person != "empty")
    
        d <- distances(graph, v, setdiff(names(V(graph)), c(v, nulls)))
        split(pers$person[match(colnames(d), pers$country)], d)
    }
    

    such that

    > f("person_A")
    $`2`
    [1] "person_N" "person_K"
    
    $`3`
    [1] "person_C" "person_E" "person_I"
    
    $`4`
    [1] "person_P" "person_M" "person_Q"
    
    $`5`
    [1] "person_J" "person_S" "person_O" "person_T" "person_G" "person_F" "person_R"
    [8] "person_H"
    
    $`6`
    [1] "person_L"
    
    $`7`
    [1] "person_B"
    
    
    > f("person_B")
    $`2`
    [1] "person_F"
    
    $`3`
    [1] "person_S" "person_G"
    
    $`4`
    [1] "person_J" "person_P" "person_H"
    
    $`5`
    [1] "person_M" "person_N" "person_K" "person_T" "person_R" "person_E" "person_L"
    
    $`6`
    [1] "person_Q" "person_O" "person_C" "person_I"
    
    $`7`
    [1] "person_A"
    
    
    > f("person_C")
    $`1`
    [1] "person_N"
    
    $`3`
    [1] "person_P" "person_M" "person_A" "person_K"
    
    $`4`
    [1] "person_J" "person_Q" "person_O" "person_G" "person_R" "person_H" "person_E"
    [8] "person_I"
    
    $`5`
    [1] "person_S" "person_T" "person_F" "person_L"
    
    $`6`
    [1] "person_B"
    

    Previous

    Not sure about your motivation and/or necessity of implementing the BFS algorithm by yourself, but I guess it is worthy to try bfs from igraph (since you are using igraph anyway in your question)

    f <- function(p, df = df2, graph = g) {
        v <- df %>%
            filter(person == p) %>%
            select(country) %>%
            pluck(1)
    
        df %>%
            arrange(match(country, names(bfs(graph, v)$order))) %>%
            filter(person != "empty") %>%
            slice(2) %>%
            select(person) %>%
            pluck(1)
    }
    

    such that

    > f("person_A")
    [1] "person_N"
    
    > f("person_B")
    [1] "person_F"
    
    > f("person_C")
    [1] "person_N"