graph context-free-grammar rascal connected-components

How to get the strongly connected components of a Rascal context-free grammar

as the question states, I would like to be able to convert a grammar to a set of strongly connected (non-terminal only) components. I want to do this by constructing a graph from the grammar and then call the connected-components function. Which brings me to the real problem: how to construct this directed graph that has edges of the type:

For each A -> xBz where A,B in Non-terminals (N) and x and y in (sigma U N)*:
    construct an edge from A to B

Is something that does similar things to this build in, or would I have to completely implement this myself? If so, could you help me get started, by for example showing how to get only terminals, non-terminals and production rules from a grammar? Instead of only by machine parseable grammar-data structure?

If there is a better way to find the components instead of constructing the graph that would be a great answer as well of course. I hope this is clear enough, if not, just let me know!

Edit: The algorithm is relatively easy I think, I just don't see how to do this in Rascal. Here is a picture of the algorithm in pseudo-code. Here grammar.V are its non-terminals and P its production rules (different definition, so different naming :s)

Solution

First get a grammar for your syntax like so:

import Grammar;
gr = grammar(#YourTopNonterminal);

Then you could use this library module (with example code on how to extract dependencies):

import analysis::grammars::Dependency;
deps = symbolDependencies(gr);

And you'd get a binary relation between dependent symbols like this:

rascal>symbolDependencies(g)
Graph[Symbol]: {
  <sort("A"),sort("B")>,
  <sort("B"),sort("C")>
}

The basic code for symbolDependencies is this:

Graph[Symbol] symbolDependencies(Grammar g) =
  { <from,to> | /prod(Symbol from,[_*,Symbol elem,_*],_) := g, /Symbol to := elem}

The comprehension loops over all rules of the grammar, takes the head from and then finds all symbols to in the rule (possibly nested due to regular expressions) and creates a tuple for each pair.

After that you'd start analyzing and transforming this relation to get the strongly connected components. The library module analysis::graphs::Graph has an example function which computes connected components (not strongly connected components, so you'd have to adapt that).

rascal>import analysis::graphs::Graph;
ok
rascal>connectedComponents(symbolDependencies(g))
set[set[Symbol]]: {{
    sort("A"),
    sort("C"),
    sort("B")
  }}

Finally, to print a symbol like sort("A") back to a pretty name can come in handy, especially if you have regular expressions over non-terminals (like * and +):

rascal>t = type(sort("A"),());
type[value]: type(
  sort("A"),
  ())
rascal>"<t>"
str: "A"

I'd also recommend visualizing the graphs using viz::Figure