I built a minimum spanning tree model, and it ran successfully. I generated a plot and want to identify, for each data point, which other data points it is connected to. Is there a way to do that?
The modeling code is below.
data(iris)
mst.mod <- ape::mst(dist(iris))
plot(mst.mod)
The tree is visualized. It looks a bit messy, but I want to identify, for example, which instances are connected to instance 1, and so on. Visually, instance 1 appears to have edges to instances 28 and 40. But is there R code to find them all for each data point?
Yes, there is.
We can use base R. Convert mst.mod to a matrix, apply which() to find the indices where 1 occurs, and, for instance, convert the result to a list.
mst.mod = ape::mst(dist(iris))
unstack(as.data.frame(which(as.matrix(mst.mod)==1L, arr.ind=TRUE)))
giving (output piped through |> head()):
$`1`
[1] 5 18 28 40
$`2`
[1] 13 35 46
$`3`
[1] 48
$`4`
[1] 30 48
$`5`
[1] 1 38
$`6`
[1] 11 19
For instance 1, besides 28 and 40, there are also edges to 5 and 18.
Depending on the desired output, which(as.matrix(mst.mod) == 1L, arr.ind = TRUE) might be enough. I haven't checked the documentation/help files to see whether there is a more direct way using {ape}.
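To make the which(..., arr.ind = TRUE) step easier to follow, here is a minimal sketch on a toy adjacency matrix. The matrix adj and the list neighbours are hypothetical names made up for illustration and are not derived from iris. The sketch assumes the object returned by ape::mst() behaves like a symmetric 0/1 matrix, as the code above relies on.

```r
# Toy symmetric adjacency matrix with edges 1-2, 2-3 and 2-4
# (hypothetical data, used only to illustrate the indexing).
labels <- as.character(1:4)
adj <- matrix(0L, nrow = 4, ncol = 4, dimnames = list(labels, labels))
adj[cbind(c(1, 2, 2), c(2, 3, 4))] <- 1L
adj <- adj + t(adj)  # mirror the edges so the matrix is symmetric

# One row per cell equal to 1, given as (row, col) index pairs.
edges <- which(adj == 1L, arr.ind = TRUE)

# Row-wise alternative to unstack(): for each node, collect the
# column indices that hold a 1.
neighbours <- lapply(seq_len(nrow(adj)), function(i) unname(which(adj[i, ] == 1L)))
names(neighbours) <- rownames(adj)
```

Because each undirected edge appears twice in a symmetric matrix, edges has two rows per edge. If the same idea is applied to as.matrix(mst.mod) in place of adj, neighbours[["1"]] should list the instances connected to instance 1.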