I'm learning Neo4j, and as a test project I'm mapping out my World of Warcraft guilds for all of my characters with data retrieved from dev.battle.net.
The obvious labels are Character and Guild, but some others are difficult to tell. Each Character has a Class, each Class has 2-4 Specializations, and each specialization fills one of 3 roles (Tank, Healer, Damage).
My test queries are...
My assumption is that Role, Specialization, Class, and Realm should also be labels, with specifics attached to the relationship.
MATCH (:Tank)-[:RoleFor]->(s:Spec)<-[h:Has]-(c:Character)-[:Member]->(g:Guild)
WHERE h.ItemLevel > 900 and g.Name like 'TestGuild1'
RETURN c,h,s
Instead Of...
MATCH (c:Character)
Where (c.Class like "Druid" and c.Spec3ItemLevel > 900) Or (c.Class like "Death Knight" and c.Spec1ItemLevel > 900)
return c
I think the first one is going to be faster unless it has to load ALL of the relationships from (:Spec)<-[]-(:Character)
before it can filter them down. It at least looks cleaner from a query perspective. Can anyone confirm this?
There are millions of characters, thousands of guilds, probably more than 100 guilds, but a much smaller number of classes (12) and specs (36).
If we assume that your 2 examples (which contain some syntax errors: like
is not a Cypher operator) are representative of all your use cases, then a data model like your first one seems appropriate. You just need to add the appropriate indexes (or uniqueness constraints) to avoid scanning a lot of data to kick off your queries.
Here are modified queries that correspond to your 2 examples:
MATCH (:Tank)-[:ROLE_FOR]->(s:Spec)<-[h:HAS_SPEC]-(c:Character)-[:MEMBER_OF]->(g:Guild)
WHERE h.itemLevel > 900 AND g.name = 'TestGuild1'
RETURN c, h, s;
MATCH (s:Spec)<-[h:HAS_SPEC]-(c:Character)-[:HAS_CLASS]->(cl:Class)
WHERE s.spec3ItemLevel > 900 AND cl.name IN ['Druid', 'Death Knight']
RETURN c;
You should probably have indexes for at least:
:Guild(name)
:Class(name)
:Spec(spec3ItemLevel)