neo4j, spark-graphx

Querying large Hierarchical


We are an organisation dealing with HR data (60 GB+ every day).

How can we query the organisation's hierarchical data efficiently? Suppose we want to answer:

a) At which level of the organisation tree is a person?

b) How many direct and indirect reportees does a person have? E.g. A has 2 direct reportees (B and C), and B and C have 10 direct reportees each. In this case, total indirect reportees for A = 20 and total reportees for A = 22.

Which framework would be best for this? Should we go for Neo4j, which provides the Cypher query language, or Spark GraphX, GraphFrames, etc.?

Some quick example code will help a lot.


Solution

  • Use Cypher for both:

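    a) To find out where the employee is in the organization relative to the top boss, follow the REPORTS_TO chain upwards. The length of the path to the employee who reports to no one is the employee's level (0 for the top boss):

    // Walk up from the employee to the root of the hierarchy
    MATCH p = (e:Employee {empid: "ID"})-[:REPORTS_TO*0..]->(boss:Employee)
    WHERE NOT (boss)-[:REPORTS_TO]->()
    RETURN e, boss, length(p) AS level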
    

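    b) To find the employees that are direct and indirect reports of an employee (a path of length 1 is a direct report, anything longer is indirect). Use *1..2 instead of * to stop at two levels, as in the example in the question:

    // No upper bound on the depth, so reports at every level are returned
    MATCH p = (e:Employee {empid: "ID"})<-[:REPORTS_TO*]-(sub:Employee)
    RETURN e, sub, length(p) AS depth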
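
    If only the counts from part b) of the question are needed rather than the matching rows, the traversal can be aggregated. The following is a minimal sketch, assuming the same Employee label, empid property and REPORTS_TO relationship used in the queries here, and assuming the column names direct, indirect and total are free to choose:

    ```cypher
    // Look up the employee first so a row comes back even with no reportees
    MATCH (e:Employee {empid: "ID"})
    // Collect every employee below e, at any depth
    OPTIONAL MATCH p = (e)<-[:REPORTS_TO*]-(sub:Employee)
    // One row per subordinate; the shortest path decides direct vs. indirect
    WITH e, sub, min(length(p)) AS depth
    RETURN e.empid AS empid,
           count(CASE WHEN depth = 1 THEN 1 END) AS direct,
           count(CASE WHEN depth > 1 THEN 1 END) AS indirect,
           count(sub) AS total
    ```

    For the hierarchy described in the question (A with direct reportees B and C, who have 10 direct reportees each), this should give direct = 2, indirect = 20 and total = 22 for A. Taking min(length(p)) only matters if an employee can reach the same manager through more than one chain; in a strict tree each subordinate has exactly one path.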