I have the following string derived from a Bayesian Network learning algorithm (e.g. from the bnlearn or deal packages):
[1] "[wst|af:bq:rloss_s:pre3][af|bq][d|wst:af:con:rloss_s][bq|con][con|af][rloss_s|af:con:pre3][pre3|af:con]"
The string defines the connections between variables and the direction of each connection.
The first variable of each term in brackets ([...]) represents a node, and all variables after | represent the nodes with a connection pointing towards that first node. These variables are separated by :.
I would like to transform the string into a data.frame that represents the connection between each variable. It should look like this:
> data.frame(string_table)
      from      to
1       af     wst
2       bq     wst
3  rloss_s     wst
4     pre3     wst
5       bq      af
6      wst       d
7       af       d
8      con       d
9  rloss_s       d
10     con      bq
11      af     con
12      af rloss_s
13     con rloss_s
14    pre3 rloss_s
15      af    pre3
16     con    pre3
I would use the graph tools here rather than string manipulation. Here is an example to illustrate:
library(bnlearn)
d = clgaussian.test
m = hc(d)
So you have the model and its string:
bnlearn::modelstring(m)
#[1] "[A][B][C][H][D|A:H][F|B:C][E|B:D][G|A:D:E:F]"
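If only the string is at hand rather than the fitted object, it can probably be turned back into a network first. A minimal sketch, assuming the string describes a valid DAG; the variable names `s`, `net` and `edges` are just for illustration:

```r
library(bnlearn)

# The model string printed above, typed in by hand.
s <- "[A][B][C][H][D|A:H][F|B:C][E|B:D][G|A:D:E:F]"

# Rebuild a "bn" object from the string alone.
net <- model2network(s)

# arcs() gives a two-column character matrix ("from", "to").
edges <- as.data.frame(arcs(net))
edges
```

Once the object is rebuilt, the approaches that use `m` work the same way on `net`.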
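Using bnlearn, loop through the nodes to get the parents of each one:
# simplify = FALSE keeps a named list even when every node has the same number of parents
stack(sapply(nodes(m), function(x) parents(m, x), simplify = FALSE))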
Or use igraph on the adjacency matrix to get the edge list:
library(igraph)
as_edgelist(graph_from_adjacency_matrix(amat(m)))
EDIT:
It seems bnlearn has a function to extract the edges directly:
arcs(m)
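One caveat: the string in the question contains the loop bq -> af -> con -> bq, so it is not a DAG and I would not expect it to load as a bn object. For a string like that, a plain base-R fallback may be the simplest route. This is only a sketch; `modelstring_to_edges` is a made-up helper name, not a bnlearn function, and it assumes every term has the form [node] or [node|parent:parent:...]:

```r
# Hypothetical helper: split a model string into a from/to edge list.
modelstring_to_edges <- function(s) {
  # Drop the outer brackets, then split the terms on the "][" between them.
  terms <- strsplit(substr(s, 2, nchar(s) - 1), "][", fixed = TRUE)[[1]]
  edges <- lapply(strsplit(terms, "|", fixed = TRUE), function(p) {
    if (length(p) < 2) return(NULL)  # node without parents, no edges
    data.frame(from = strsplit(p[2], ":", fixed = TRUE)[[1]],
               to = p[1],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, edges)  # NULL entries are skipped by rbind
}

s <- "[wst|af:bq:rloss_s:pre3][af|bq][d|wst:af:con:rloss_s][bq|con][con|af][rloss_s|af:con:pre3][pre3|af:con]"
string_table <- modelstring_to_edges(s)
string_table
```

The rows come out in the order the terms appear in the string, which is the order used in the table in the question.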