
Gremlin: Calculate the division of two counts in one line of code


I have two counts, calculated as follows:

1) g.V().hasLabel('brand').where(__.inE('client_brand').count().is(gt(0))).count()

2) g.V().hasLabel('brand').count()

and I want to get one line of code that results in the first count divided by the second.


Solution

  • Here's one way to do it:

    g.V().hasLabel('brand').
      fold().as('a','b').
      math('a/b').
        by(unfold().where(inE('client_brand')).count()).
        by(unfold().count())
    

    Note that I simplified the first traversal to just .where(inE('client_brand')).count(). You only care that there is at least one edge, so there's no need to count them all and then compare.
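
    To make the difference concrete, here is a rough sketch in plain Python (not Gremlin) of the two filters. The in_edges dictionary and both helper names are made up for illustration, and this only models the idea, not how any graph engine actually executes the traversal:

```python
# Hypothetical toy data: brand id -> labels of its incoming edges.
in_edges = {
    "nike": ["client_brand", "client_brand", "client_brand"],
    "acme": [],
    "puma": ["client_brand"],
}

def has_edge_by_count(labels):
    # Models inE('client_brand').count().is(gt(0)): looks at every edge.
    return sum(1 for label in labels if label == "client_brand") > 0

def has_edge_by_existence(labels):
    # Models where(inE('client_brand')): any() stops at the first match.
    return any(label == "client_brand" for label in labels)

by_count = [b for b, labels in in_edges.items() if has_edge_by_count(labels)]
by_existence = [b for b, labels in in_edges.items() if has_edge_by_existence(labels)]

# Both filters keep the same brands, so the final ratio is unchanged.
ratio = len(by_existence) / len(in_edges)
```

    Both filters select the same brands; the existence check just has less work to do on brands with many edges.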

    You could also union() like:

    g.V().hasLabel('brand').
      union(where(inE('client_brand')).count(),
            count()).
      fold().as('a','b').
      math('a/b').
        by(limit(local,1))
        by(tail(local))
    

    While the first one is a bit easier to read and follow, the second is probably nicer because it only stores a list of the two counts, whereas the first stores a list of all the "brand" vertices, which would be more memory intensive.
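
    A rough plain-Python analogy of that memory trade-off, assuming a hypothetical brands() generator that yields (name, has_client_brand_edge) pairs. It only illustrates what ends up in the folded list in each case, not the engine's real execution:

```python
# Hypothetical stream of (brand, has_client_brand_in_edge) pairs.
def brands():
    yield ("nike", True)
    yield ("acme", False)
    yield ("puma", True)
    yield ("fila", False)

# First approach: fold() comes first, so the list holds every brand.
folded = list(brands())
ratio_folded = sum(1 for _, linked in folded if linked) / len(folded)

# Second approach: reduce to two counts first, then fold only those.
with_edge = 0
total = 0
for _, linked in brands():
    total += 1
    if linked:
        with_edge += 1
counts = [with_edge, total]            # the list that fold() would hold
ratio_counts = counts[0] / counts[-1]  # limit(local,1) over tail(local)
```

    The result is the same either way; the difference is a list of four brands versus a list of two numbers.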

    Yet another way, provided by Daniel Kuppitz, that uses groupCount() in an interesting way:

    g.V().hasLabel('brand').
      groupCount().
        by(choose(inE('client_brand'),
                    constant('a'),
                    constant('b'))).
      math('a/(a+b)')
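
    As a sketch of what the groupCount() trick computes, here is the same idea in plain Python with collections.Counter. The in_edges data is hypothetical. Note that Counter returns 0 for a missing key, which is a Python convenience and not a claim about how math() behaves when a group is absent:

```python
from collections import Counter

# Hypothetical toy data: brand id -> labels of its incoming edges.
in_edges = {
    "nike": ["client_brand"],
    "acme": [],
    "puma": ["client_brand", "client_brand"],
    "fila": [],
}

# Models groupCount().by(choose(inE('client_brand'), constant('a'), constant('b'))):
# every brand with at least one matching in-edge lands in 'a', the rest in 'b'.
groups = Counter(
    "a" if "client_brand" in labels else "b"
    for labels in in_edges.values()
)

# Models math('a/(a+b)'): linked brands over all brands.
ratio = groups["a"] / (groups["a"] + groups["b"])
```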
    

    The following solution uses the sack() step instead, and its verbosity shows why we have the math() step:

    g.V().hasLabel('brand').
      groupCount().
        by(choose(inE('client_brand'),
                    constant('a'),
                    constant('b'))).
      sack(assign).
        by(coalesce(select('a'), constant(0))).
      sack(mult).
        by(constant(1.0)). /* we need a double */
      sack(div).
        by(select(values).sum(local)).
      sack()
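
    Step by step, the sack arithmetic above can be modelled in plain Python roughly like this. The function name ratio_via_sack and the sample maps are made up for illustration:

```python
def ratio_via_sack(groups):
    """Model the sack() steps on a groupCount()-style dict such as {'a': 2, 'b': 2}."""
    # sack(assign).by(coalesce(select('a'), constant(0))): fall back to 0 if 'a' is absent.
    sack = groups.get("a", 0)
    # sack(mult).by(constant(1.0)): turn the count into a floating-point value.
    # (Python's / is already true division; the traversal needs this step to get a double.)
    sack = sack * 1.0
    # sack(div).by(select(values).sum(local)): divide by the sum of all group counts.
    sack = sack / sum(groups.values())
    return sack

half = ratio_via_sack({"a": 2, "b": 2})
none_linked = ratio_via_sack({"b": 3})
```

    The second call shows what the coalesce() fallback buys you: when no brand has a 'client_brand' edge, the result is 0.0 rather than a failed lookup of 'a'.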
    

    If you can use lambdas then:

    g.V().hasLabel('brand').
      union(where(inE('client_brand')).count(),
            count()).
      fold().
      map { it.get()[0] / it.get()[1] }