r filter dplyr standard-evaluation

R dplyr resolve variable in conditional filter


I am trying to filter based on a variable value, and have tried multiple combinations of filter_, dots and quotes to no avail.

As an example, I have a vector runlist

runlist = c(1, 2, 3, 4, 5) 

and a dataframe boo

run <- rep(seq(5), 3)
edge1 <- sample(20, 15)
edge2 <- sample(20, 15)
weights <- sample(50, 15)
boo <- as.data.frame(cbind(run, edge1, edge2, weights))

and I want to filter boo iteratively, one run at a time, with something like

for (i in runlist) {
    bop <- boo %>% filter( run == i )
    str(boo)
}

I suspect I'll be told not to use for loops in R and to use group_by(run) instead, but I'm sending this data to igraph and need to subset it further to just the edges and weights, which drops the grouping variable, as in

bop <- boo %>% filter( run == i ) %>% select( edge1, edge2, weights )

I will create a network graph and find density and centrality values for each run.

bing <- graph.data.frame(bop)

How do I get the i in the conditional filter to resolve as the correct index?


Solution

  • My answer is not about "resolving a variable in a conditional filter", but there's a much easier way to do what you want to do.

    The big idea is to split the data frame based on the variable run, and map a function onto each of those pieces. This function takes a piece of the data frame and spits out an igraph.

    The following code does this, storing the list of graphs in the column graph and their densities in the column density. (graph is a list-column; see the R for Data Science book for more on those.)

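    library(dplyr)   # group_by, mutate, select
    library(tidyr)   # nest
    library(purrr)   # map
    library(igraph)  # graph.data.frame, graph.density

    boo %>%
      group_by(run) %>%
      nest() %>%   # one row per run; the remaining columns go into the list-column data
      mutate(graph = map(data, function(x) graph.data.frame(x %>% select(edge1, edge2, weights)))) %>%
      mutate(density = map(graph, function(x) graph.density(x)))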
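
The same split-and-map idea can also be sketched in base R, as an illustration only. This assumes the boo data frame from the question (rebuilt here with data.frame() and an arbitrary seed so the example is reproducible) and uses igraph's newer function names graph_from_data_frame() and edge_density(). The names edges_by_run, graphs and densities are made up for this sketch, and the igraph step is skipped if the package is not installed.

```r
# Rebuild the example data from the question (the seed is an arbitrary choice)
set.seed(42)
boo <- data.frame(
  run     = rep(seq(5), 3),
  edge1   = sample(20, 15),
  edge2   = sample(20, 15),
  weights = sample(50, 15)
)

# Split the edge columns into a named list with one data frame per run;
# the run value becomes the list name, so it is not lost
edges_by_run <- split(boo[, c("edge1", "edge2", "weights")], boo$run)

# Map a function over the pieces: build one graph per run, then its density
if (requireNamespace("igraph", quietly = TRUE)) {
  graphs    <- lapply(edges_by_run, igraph::graph_from_data_frame)
  densities <- vapply(graphs, igraph::edge_density, numeric(1))
}
```

When igraph is available, densities is a named numeric vector with one entry per run. As for the original loop: when boo has no column named i, filter(run == i) looks i up in the calling environment, so the loop does filter on the right run. As written, it prints str(boo), the unfiltered data frame, rather than bop.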