I have a SQL table like so:
Update: I'm changing the example table as the existing hierarchical nature of the original data (State, Cities, Schools) is overshadowing the fact that a simple relationship is needed between the items.
entities
id name
1 Apple
2 Orange
3 Banana
4 Carrot
5 Mushroom
I want to define two-way relationships between these entities so a user viewing one entity can see a list of all related entities.
The relationships are defined by an end user.
What is the best way to represent these relationships in the database and subsequently query and update them?
One way as I see it...
My instinct says a relationship table like so:
entity_entity
entity_id_a entity_id_b
1 2
5 1
4 1
5 4
1 3
That being the case, given a supplied entity_id of 4, how would one get all related records, which would be 1 and 5?
Likewise a query of entity_id = 1 should return 2, 3, 4, and 5.
Thanks for your time and let me know if I can clarify the question at all.
Define a constraint: entity_id_a < entity_id_b
.
Create indexes:
CREATE UNIQUE INDEX ix_a_b ON entity_entity(entity_id_a, entity_id_b);
CREATE INDEX ix_b ON entity_entity(entity_id_b);
Second index doesn't need to include entity_id_a
as you will use it only to select all a
's within one b
. RANGE SCAN
on ix_b
will be faster than a SKIP SCAN
on ix_a_b
.
Populate the table with your entities as follows:
INSERT
INTO entity_entity (entity_id_a, entity_id_b)
VALUES (LEAST(@id1, @id2), GREATEST(@id1, @id2))
Then select:
SELECT entity_id_b
FROM entity_entity
WHERE entity_id_a = @id
UNION ALL
SELECT entity_id_a
FROM entity_entity
WHERE entity_id_b = @id
UNION ALL
here lets you use above indexes and avoid extra sorting for uniqueness.
All above is valid for a symmetric and anti-reflexive relationship. That means that:
If a is related to b, then b is related to a
a is never related to a