I have the following data frame:
h = data.frame(fr = c('A','A','X','E','B','W','C','Y'),
t = c('B','E','Y','C','A','X','A','W'),
Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))
I found all the cycles, each written as a vertex sequence that starts with its smallest vertex number, using the question "r igraph find all cycles" as a reference. The result is:
[[1]]
A E C A
1 3 6 1
[[2]]
A B A
1 4 1
[[3]]
X Y W X
2 7 5 2
Now I'd like to calculate, for each cycle:
the sum of the amounts along its edges
the number of edges
The result should look like this:
A - B - A : 40 + 33 = 73 ; number of edges : 2
A - E - C - A : 30 + 10 + 21 = 61 ; number of edges : 3
X - Y - W - X : 55 + 90 + 78 = 223 ; number of edges : 3
Does anyone know how to calculate this in R? Any help would be greatly appreciated!
EDIT
Thanks to the reply, I can now calculate both items. However, I ran into a small problem.
The results come out wrong and I can't work out why, even after modifying my code many times.
The output should look like this:
[[1]] [[2]] [[3]]
A E C A A B A X Y W X
Path sumAmt numberOfEdges
<fct> <dbl> <int>
1 "A - B - A" 73 2
2 "A - E - C - A" 61 3
3 "X - Y - W - X" 223 3
But when I use the cycles produced by my own code, the first node of each cycle is missing:
[[1]] [[2]] [[3]]
E C A B A Y W X
Path sumAmt numberOfEdges
<fct> <dbl> <int>
1 " - B - A" 33 2
2 " - E - C - A" 31 3
3 " - Y - W - X" 168 3
Here is my code for finding the cycles. Am I missing something?
h = data.frame(fr = c('A','A','X','E','B','W','C','Y'),
t = c('B','E','Y','C','A','X','A','W'),
Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))
library(igraph)
g <- graph.data.frame(h, directed = TRUE)
Cycles = NULL
for(fr in V(g)) {
for(t in neighbors(g, fr, mode = "out")) {
Cycles = c(Cycles,
lapply(all_simple_paths(g, t, fr, mode = "out"), function(p)c(fr,p)))
}
}
LongCycles = Cycles[which(sapply(Cycles, length) > 1)]
LongCycles[sapply(LongCycles, min) == sapply(LongCycles, `[`, 1)]
Does anyone have any ideas? Any help would be appreciated!
There's probably a shorter way, but provided your data is as follows (where h is your table with amounts and all_cycles is your list of cycles):
h = data.frame(fr = c('A','A','X','E','B','W','C','Y'),
t = c('B','E','Y','C','A','X','A','W'),
Amt = c( 40, 30, 55, 10, 33, 78, 21, 90))
all_cycles <- list(
c(A = 1, E = 3, C = 6, A = 1),
c(A = 1, B = 4, A = 1),
c(X = 2, Y = 7, W = 5, X = 2)
)
you could do:
library(dplyr)
data.frame(
Nodes = unlist(lapply(all_cycles, names)),
Path = unlist(lapply(seq_along(all_cycles),
function(x) rep(paste(names(all_cycles[[x]]), collapse = " - "),
length(all_cycles[[x]]))))
) %>%
group_by(Path) %>%
mutate(fr = Nodes, t = lead(Nodes)) %>%
left_join(h, by = c("fr", "t")) %>%  # match each step of the path to its edge in h
summarise(sumAmt = sum(Amt, na.rm = TRUE), numberOfEdges = sum(!is.na(t)))
To get:
# A tibble: 3 x 3
Path sumAmt numberOfEdges
<fct> <dbl> <int>
1 A - B - A 73 2
2 A - E - C - A 61 3
3 X - Y - W - X 223 3
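If you'd rather not depend on dplyr, the same calculation can be sketched in base R. This is only a sketch under a few assumptions: each `fr`/`t` pair occurs at most once in `h`, every cycle is a fully named vector as in `all_cycles` above, and `edge_amt` and `summarise_cycle` are names I made up for illustration.

```r
h <- data.frame(fr  = c('A','A','X','E','B','W','C','Y'),
                t   = c('B','E','Y','C','A','X','A','W'),
                Amt = c( 40, 30, 55, 10, 33, 78, 21, 90),
                stringsAsFactors = FALSE)

all_cycles <- list(
  c(A = 1, E = 3, C = 6, A = 1),
  c(A = 1, B = 4, A = 1),
  c(X = 2, Y = 7, W = 5, X = 2)
)

# Amount of each directed edge, keyed by "from->to"
# (assumption: every fr/t pair occurs only once in h)
edge_amt <- setNames(h$Amt, paste(h$fr, h$t, sep = "->"))

# Hypothetical helper: summarise one cycle given as a named vector
summarise_cycle <- function(cyc) {
  nodes <- names(cyc)
  from  <- head(nodes, -1)   # every node except the last
  to    <- tail(nodes, -1)   # every node except the first
  data.frame(Path          = paste(nodes, collapse = " - "),
             sumAmt        = sum(edge_amt[paste(from, to, sep = "->")]),
             numberOfEdges = length(from),
             stringsAsFactors = FALSE)
}

# One row per cycle, in the same order as all_cycles
res <- do.call(rbind, lapply(all_cycles, summarise_cycle))
res
```

Unlike the dplyr version, the rows keep the order of `all_cycles` instead of being sorted by `Path`.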
If the first value in each element of your list is always unnamed (as in your edited output), you can fill it in from the last node, since a cycle ends where it starts:
data.frame(
Nodes = unlist(lapply(all_cycles, names)),
id = unlist(lapply(seq_along(all_cycles),
function(x) rep(x, length(all_cycles[[x]])))), stringsAsFactors = FALSE
) %>%
group_by(id) %>% mutate(Nodes = replace(Nodes, Nodes == "", last(Nodes)),
Path = paste(Nodes, collapse = " - ")) %>%
mutate(fr = Nodes, t = lead(Nodes)) %>%
group_by(Path, id) %>%
left_join(h, by = c("fr", "t")) %>%  # match each step of the path to its edge in h
summarise(sumAmt = sum(Amt, na.rm = TRUE), numberOfEdges = sum(!is.na(t)))
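Another option is to repair the names before summarising, so that the first snippet works unchanged. My guess (not verified against your session) is that `for (fr in V(g))` hands you plain vertex ids without names, so `c(fr, p)` puts an unnamed number in front of each path. A minimal sketch of the repair, assuming every cycle ends on the node it starts from; `restore_first_name` and `fixed_cycles` are hypothetical names:

```r
# Cycles as they appear in the edited question: the first element has no name
all_cycles <- list(
  c(1, E = 3, C = 6, A = 1),
  c(1, B = 4, A = 1),
  c(2, Y = 7, W = 5, X = 2)
)

# Hypothetical helper: copy the last node's name onto the unnamed first node
# (assumption: a cycle always ends on the node it started from)
restore_first_name <- function(cyc) {
  nm <- names(cyc)
  nm[1] <- nm[length(nm)]
  names(cyc) <- nm
  cyc
}

fixed_cycles <- lapply(all_cycles, restore_first_name)

# Paths now include the starting node again
sapply(fixed_cycles, function(cyc) paste(names(cyc), collapse = " - "))
```

Passing `fixed_cycles` in place of `all_cycles` should then give the full paths and sums.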