amazon-web-services amazon-dynamodb

Global Secondary Indexes or manual replication in DynamoDB


This is more of an architectural question than anything else. I'm currently building the data model for a booking feature, with DynamoDB as the database. Each user can book a spot in a SPA. I'm using a single-table design, where:

  • SPAs have PK SPA#spaID (SKs identify the SPA's other entities, such as employees, rooms, etc.)
  • Users have PK USER#userID (SKs identify the user's other entities, such as profile, orders, and so on)

Now the booking entity will hold the usual booking data plus spaId and userId, and both need to be searchable: I need to be able to get all bookings for a user and all bookings for a SPA.

The obvious solution would be to replicate the same booking under two PKs: SPA#spaId with BOOKING#bookingID as SK, and USER#userId with BOOKING#bookingID as SK.

I can see this would also be achievable with DynamoDB's Global Secondary Indexes, using one single entity:

  • PK: SPA#spaID
  • SK: BOOKING#bookingID
  • GSIPK1: USER#userId
  • GSISK1: BOOKING#bookingID

This would mean that updating the single item in the base table will automatically propagate the change to the Global Secondary Index, if I understood correctly.
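To make the single-item layout concrete, here is a minimal sketch in Python. It only builds request parameters as plain dicts in the shape boto3's low-level DynamoDB client expects, so nothing is sent to AWS. The table name `SpaBookings` and index name `GSI1` are assumptions for illustration, not something from my actual setup.

```python
# Sketch only: builds DynamoDB request parameters as plain dicts.
TABLE = "SpaBookings"  # hypothetical table name
GSI1 = "GSI1"          # hypothetical index name (keys: GSIPK1 / GSISK1)


def booking_item(spa_id, user_id, booking_id):
    """One booking item carrying both the base-table key and the GSI key."""
    return {
        "PK": {"S": f"SPA#{spa_id}"},
        "SK": {"S": f"BOOKING#{booking_id}"},
        "GSIPK1": {"S": f"USER#{user_id}"},
        "GSISK1": {"S": f"BOOKING#{booking_id}"},
    }


def bookings_for_spa(spa_id):
    """Query parameters against the base table: all bookings of one SPA."""
    return {
        "TableName": TABLE,
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :sk)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"SPA#{spa_id}"},
            ":sk": {"S": "BOOKING#"},
        },
    }


def bookings_for_user(user_id):
    """Query parameters against the index: all bookings of one user."""
    return {
        "TableName": TABLE,
        "IndexName": GSI1,
        "KeyConditionExpression": "GSIPK1 = :pk AND begins_with(GSISK1, :sk)",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"USER#{user_id}"},
            ":sk": {"S": "BOOKING#"},
        },
    }
```

With a boto3 client, these would presumably be used as `client.put_item(TableName=TABLE, Item=booking_item("s1", "u1", "b1"))` and `client.query(**bookings_for_user("u1"))`. Only one write is issued; the user-side view comes from the index.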

But I'm not sure this is the right approach. In many single-table design videos I have seen architectures that replicate the data manually, updating first [PK SPA# SK BOOKING#] and then [PK USER# SK BOOKING#].
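For comparison, the manual replication path might look roughly like the sketch below. It is an assumption on my part that the two writes would be grouped in a single `TransactWriteItems` request so the copies cannot drift apart; the videos may simply issue two separate puts. The table name `SpaBookings` and the `details` argument are hypothetical.

```python
# Sketch only: builds the parameters for a two-item transactional write.
TABLE = "SpaBookings"  # hypothetical table name


def replicate_booking(spa_id, user_id, booking_id, details=None):
    """Parameters writing the same booking under the SPA and the user PK."""
    shared = {
        "SK": {"S": f"BOOKING#{booking_id}"},
        "spaId": {"S": spa_id},
        "userId": {"S": user_id},
    }
    shared.update(details or {})  # extra booking attributes, already typed
    spa_copy = {"PK": {"S": f"SPA#{spa_id}"}, **shared}
    user_copy = {"PK": {"S": f"USER#{user_id}"}, **shared}
    return {
        "TransactItems": [
            {"Put": {"TableName": TABLE, "Item": spa_copy}},
            {"Put": {"TableName": TABLE, "Item": user_copy}},
        ]
    }
```

This would be sent with `client.transact_write_items(**replicate_booking("s1", "u1", "b1"))`. Every later change to the booking has to touch both copies in the same way, which is the extra work the GSI approach avoids.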

Is there any reason why I should follow the second path? Are there any limitations on the GSIs that I'm not seeing right now?


Solution

  • You should use a global secondary index (GSI). The cases you have seen that write to two different entities do so because they are not updating the same item. In your case all of the information is contained in a single item, so you can simply let DynamoDB replicate the data to your index to fulfill your access patterns.

    PK       SK           GSI1PK
    SPAId    BookingId    UserId

    The GSI would simply use GSI1PK as its partition key and SK as its sort key. Alternatively, you could use a timestamp as the sort key to get a user's bookings ordered by time, which may be useful.

    Where you may need transactions is when checking availability at the SPA. There may be only a limited number of slots for any given time period, so when a user books a slot, you have to ensure that only one user can obtain it. However, that is outside your current concern.
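As a rough illustration of that availability check, the sketch below builds a `TransactWriteItems` request that decrements a slot counter and writes the booking together, so either both succeed or neither does. The slot item (`SK` of the form `SLOT#slotId`), its numeric `remaining` attribute, and the table name `SpaBookings` are all assumptions made for this example; your model may track availability differently.

```python
# Sketch only: parameters for booking a slot without overbooking it.
TABLE = "SpaBookings"  # hypothetical table name


def reserve_slot(spa_id, user_id, booking_id, slot_id):
    """Parameters that take one unit of a slot and create the booking."""
    return {
        "TransactItems": [
            {
                # Hypothetical slot item with a numeric "remaining" counter.
                "Update": {
                    "TableName": TABLE,
                    "Key": {
                        "PK": {"S": f"SPA#{spa_id}"},
                        "SK": {"S": f"SLOT#{slot_id}"},
                    },
                    "UpdateExpression": "SET #r = #r - :one",
                    # The write is rejected once no capacity is left.
                    "ConditionExpression": "#r > :zero",
                    "ExpressionAttributeNames": {"#r": "remaining"},
                    "ExpressionAttributeValues": {
                        ":one": {"N": "1"},
                        ":zero": {"N": "0"},
                    },
                }
            },
            {
                "Put": {
                    "TableName": TABLE,
                    "Item": {
                        "PK": {"S": f"SPA#{spa_id}"},
                        "SK": {"S": f"BOOKING#{booking_id}"},
                        "GSI1PK": {"S": f"USER#{user_id}"},
                    },
                    # Refuse to overwrite an existing booking with this key.
                    "ConditionExpression": "attribute_not_exists(PK)",
                }
            },
        ]
    }
```

Passing the result to `client.transact_write_items(**reserve_slot("s1", "u1", "b1", "2024-05-01T10"))` would fail the whole transaction if the slot has no remaining capacity, so two users racing for the last place cannot both end up with a booking.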