I have a dataframe df that contains edge weights between pairs of nodes:
df <- data.frame(c("A","A","B","B","C","C"),
                 c("B","C","A","C","A","B"),
                 c(2,3,6,4,9,1))
colnames(df) <- c("node_from", "node_to", "weight")
print(df)
# Output:
  node_from node_to weight
1         A       B      2
2         A       C      3
3         B       A      6
4         B       C      4
5         C       A      9
6         C       B      1
I would like to contract this dataframe by merging nodes A and B and summing all edge weights to and from these nodes with any other node, in this case C only. The result should be an edge list where the edges between A and B have disappeared and AB is now one node:
# some code to merge nodes A and B
print(df_contracted)
# Output:
  node_from node_to weight
1        AB       C      7
3         C      AB     10
Is there a way to do this efficiently for larger dataframes?
I could convert the dataframe to an actual graph using graph_from_data_frame from the igraph package and then use the contract function, but given that I have to do this operation multiple times, I'd rather not convert it to a graph and back every time.
Here's a dplyr solution:
library(dplyr)
to.merge <- c('A', 'B')
merged.name <- paste(to.merge, collapse='')
df %>%
  mutate(across(c(node_from, node_to),
                ~ if_else(.x %in% to.merge, merged.name, .x))) %>%
  group_by(node_from, node_to) %>%
  summarise(weight = sum(weight), .groups = "drop") %>%
  filter(node_from != node_to)
# # A tibble: 2 × 3
#   node_from node_to weight
#   <chr>     <chr>    <dbl>
# 1 AB        C            7
# 2 C         AB          10
It renames every "A" or "B" in the node_from and node_to columns to "AB", groups rows that share the same combination of node_from and node_to, sums the weights within each group, and finally removes the AB<->AB self-loop.
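Since the question mentions repeating the operation several times, the same steps could also be wrapped in a small function. The following is only a sketch in base R, not part of the dplyr answer above: the name contract_nodes and its arguments are hypothetical, it assumes the columns are called node_from, node_to and weight, and it assumes at least one edge remains after the self-loops are dropped (aggregate fails on an empty data frame).

```r
# Hypothetical helper (not from the original answer): merge the nodes in
# `to_merge` into a single node called `merged_name`.
contract_nodes <- function(edges, to_merge,
                           merged_name = paste(to_merge, collapse = "")) {
  # Use character columns so relabelling also works if the inputs are factors.
  edges$node_from <- as.character(edges$node_from)
  edges$node_to   <- as.character(edges$node_to)

  # Relabel every endpoint that belongs to the merged set.
  edges$node_from[edges$node_from %in% to_merge] <- merged_name
  edges$node_to[edges$node_to %in% to_merge]     <- merged_name

  # Drop the self-loops created by the merge, then sum parallel edges.
  edges <- edges[edges$node_from != edges$node_to, , drop = FALSE]
  aggregate(weight ~ node_from + node_to, data = edges, FUN = sum)
}

df <- data.frame(
  node_from = c("A", "A", "B", "B", "C", "C"),
  node_to   = c("B", "C", "A", "C", "A", "B"),
  weight    = c(2, 3, 6, 4, 9, 1)
)

df_contracted <- contract_nodes(df, c("A", "B"))
print(df_contracted)
```

Because the function returns another edge list with the same three columns, its result can be passed straight back in to contract further nodes without building a graph object in between. Note that aggregate may return the rows in a different order than the dplyr pipeline.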