r, social-networking, network-analysis, statnet

Setting up a statnet model in R


I would like to simulate exponential family random graphs, and I have just started learning the statnet and ergm R packages. Following a tutorial I found online, I was able to fit an ERGM to an example dataset:

# install.packages('statnet')
# install.packages('ergm')
# install.packages('coda')

library(statnet)

set.seed(123)

data(package='ergm') # tells us the datasets in our packages
data(florentine) # loads flomarriage and flobusiness data

# Triad model
flomodel <- ergm(flomarriage ~ edges + triangle) 
summary(flomodel)

Now I would like to use the simulate command to simulate networks with a pre-specified number of nodes from a pre-specified formula (one that is not fitted to any particular dataset), for example P(y) = (1/Z) exp(a * num_edges + b * num_triangles), where a and b are user-specified coefficients. How should I go about specifying such a model in statnet?


Solution

  • You can simulate from a given formula with simulate (or simulate.formula):

    simulate(flomarriage ~ edges + triangles, coef = c(3,1))
    

    To constrain the simulation to have the same number of edges as the given graph (flomarriage in this case):

    simulate(flomarriage ~ edges + triangles, coef = c(3,1), constraints = ~edges)
    

    Not every constraint you might want to apply is available, since each one requires a specific MCMC sampler. For a list of those that are available, see ?ergm.constraints

    To fix the simulation to an arbitrary number of nodes and edges (not based on observed data), a workaround is to create such a network first. For example, to simulate over networks with 17 nodes and 16 edges:

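    test.mat <- matrix(0, 17, 17)
    test.mat[1, -1] <- 1  # ties node 1 to every other node: 16 edges, no self-loop
    test.net <- as.network(test.mat, directed = FALSE)
    # keep the edge count fixed at 16 while sampling with the triangle term
    test.sim <- simulate(test.net ~ triangles, coef = 1, constraints = ~edges)
    # summary() on a formula reports the statistics (summary.statistics is deprecated in newer ergm versions)
    summary(test.sim ~ edges + triangles)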
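
    If only the number of nodes needs to be fixed (and not the number of edges), an empty network can serve as the left-hand side of the formula. The following is a minimal sketch under that assumption; the values of n, a and b are hypothetical placeholders chosen for illustration, not recommended settings:

    ```r
    library(ergm)  # attaches the network package as a dependency

    set.seed(123)

    n <- 17    # hypothetical number of nodes
    a <- -2    # hypothetical coefficient for the edges term
    b <- 0.1   # hypothetical coefficient for the triangle term

    # An empty undirected network on n nodes; only its size and type matter here.
    empty.net <- network.initialize(n, directed = FALSE)

    # Draw one network from P(y) proportional to exp(a * edges + b * triangles).
    # coef must follow the order of the terms in the formula.
    sim <- simulate(empty.net ~ edges + triangle, coef = c(a, b))

    # Statistics of the simulated network.
    stats <- summary(sim ~ edges + triangle)
    stats
    ```

    With a strongly positive triangle coefficient the sampler tends to drift toward very dense graphs, so it is worth checking the simulated statistics before relying on the draws.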
    

    p.s. I don't recommend using the triangles term in ERGMs. The geometrically weighted terms (gwesp, gwdsp) are better substitutes and are more stable (less prone to model degeneracy).
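
    As a rough sketch of what simulating with a geometrically weighted term might look like, the example below replaces the triangle term with gwesp. The decay value of 0.5, the coefficients and the network size are illustrative assumptions only:

    ```r
    library(ergm)

    set.seed(123)

    # Empty undirected network on 17 nodes (size chosen for illustration).
    net <- network.initialize(17, directed = FALSE)

    # gwesp with a fixed decay of 0.5; both coefficients are hypothetical.
    sim.gw <- simulate(net ~ edges + gwesp(0.5, fixed = TRUE), coef = c(-2, 0.5))

    # Compare edge count, triangle count and the gwesp statistic of the draw.
    stats.gw <- summary(sim.gw ~ edges + triangle + gwesp(0.5, fixed = TRUE))
    stats.gw
    ```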