Neo4j Web Analytics model


I have to query the number of users who have gone from one page to another in my graph database and provide it to the UI, which would look similar to this image. I am not concerned with the view; I am responsible for the backend data model and queries. I would like to depict the paths followed by users using Neo4j and produce output like a Sankey diagram, similar to what Google Analytics provides. Is using Neo4j a feasible option?

My current plan is to store each URL as a node and to connect two nodes with a relationship of type "PATH". The traversed path is stored as a property on the relationship, with a value.

For example, if a user moves from A -> B -> C -> D, then the relationship between A and B will have the property A_B:1, the relationship between B and C will have the property AB_C:1, and the relationship between C and D will have the property ABC_D:1.
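To make the naming scheme concrete, here is a minimal Python sketch of how those property keys would be generated for one visit. The function name `path_property_keys` is hypothetical and only illustrates the scheme described above; it does not touch Neo4j.

```python
def path_property_keys(pages):
    """Return (source, target, property_key) for each hop of one visit path."""
    hops = []
    for i in range(1, len(pages)):
        prefix = "".join(pages[:i])   # pages visited so far, e.g. "AB"
        key = f"{prefix}_{pages[i]}"  # e.g. "AB_C"
        hops.append((pages[i - 1], pages[i], key))
    return hops
```

Note that every distinct path prefix produces a new key on the same relationship, so the number of properties grows with the number of distinct paths users take.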

Does anyone have input on this? Is this a good model?


Solution

  • If you were to use Neo4j, it certainly makes sense to use nodes to model states and relationships to model the flows between states.

    But it is a very bad idea to try to maintain real-time flow information as extra properties in the data, since the flow can be derived directly from the nodes and relationships without any extra data. Adding your own flow properties would be redundant, add complexity, waste time and storage, and be vulnerable to race conditions (and therefore would not be reliable).
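
    As a rough illustration of deriving the flow at query time instead of storing counters, the sketch below counts page-to-page transitions from raw visit records. It is an in-memory Python stand-in, not Neo4j code: the per-session page lists and the names `transition_counts` and `sankey_links` are assumptions made for the example, and in Neo4j the same aggregation would be expressed as a query over the stored relationships.

```python
from collections import Counter


def transition_counts(sessions):
    """Count page-to-page transitions across all recorded sessions."""
    counts = Counter()
    for pages in sessions:
        # Pair each page with its successor: A->B, B->C, ...
        for source, target in zip(pages, pages[1:]):
            counts[(source, target)] += 1
    return counts


def sankey_links(sessions):
    """Shape the counts as source/target/value links, sorted for stable output."""
    counts = transition_counts(sessions)
    return [
        {"source": s, "target": t, "value": n}
        for (s, t), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    # Hypothetical visit data: one ordered list of pages per session.
    example_sessions = [["A", "B", "C", "D"], ["A", "B", "D"], ["B", "C"]]
    for link in sankey_links(example_sessions):
        print(link)
```

    Because the counts are computed from the recorded visits on demand, there is no counter to keep in sync and nothing for concurrent writers to race on.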