I'm trying to figure out the data model and access patterns for an app keeping track of animal movements between different fields (pastures). There are movement records that look like this:
PK FROM TO DATE
------------------------------------------------------
ANIMAL#001 FIELD#A FIELD#B January 3
ANIMAL#001 FIELD#Q FIELD#R September 19
ANIMAL#002 FIELD#A FIELD#B January 3
ANIMAL#003 FIELD#C FIELD#D March 15
ANIMAL#005 FIELD#F FIELD#A April 22
For a specific field, e.g. FIELD#A
, I'd like to know all the movements into and out of that field, the date of the movement, and the number of animals. The results should look like:
DATE FROM TO NUMBER_ANIMALS
--------------------------------------------------
January 3 FIELD#A FIELD#B 2
April 22 FIELD#F FIELD#A 1
Possible solutions and attempts:
A GSI with PK=FROM, SK=TO
. If I query the GSI with PK=FIELD#A
, this only gives one half of the picture, that is, movements from FIELD#A
. I can't obtain movements to FIELD#A
.
A composite attribute like FIELD#A#FIELD#B
used as the PK
in a GSI. Runs into the same problem as attempt 1.
Two GSI. GSI1 has PK=FROM
and GSI2 has PK=TO
. I can query GSI1 with PK=FIELD#A
and do some post-processing (groupby, count
) to get part of the result. I can then query GSI2 with PK=FIELD#A
and post-process, getting the rest of the result. This looks like it will work but requires two GSI and two queries. I can't overload one GSI since both columns in use are from the same item.
Some combination of scanning the entire table and filtering the results which I'd rather avoid since there might be 50,000+ items in the entire table.
I can see how to do it with two GSIs, but what's the most efficient way?
I could imagine a slightly different table structure (ANIMALID
being the partition key and FIELDID
being the sort key):
ANIMALID | FIELDID | FROM_TO | ...
——————————————————————————————————
ANIMAL#1 | FIELD#A | FROM | ...
ANIMAL#1 | FIELD#B | TO | ...
ANIMAL#2 | FIELD#C | FROM | ...
ANIMAL#2 | FIELD#A | TO | ...
And a GSI with the following structure:
FIELDID | ANIMALID | ...
Then you can query the GSI just by FIELDID
and aggregate the results.