We are building a system that stores purchase history of customers. Mainly 3 entities,
For 'Orders' and 'Returns' we are going to subscribe to pubsub events from another GCP project. Eventually, we need to have event log of all events and also the the current state of the 'orders' and 'returns'. In my view BigQuery would do a great job storing events, as it is mainly an append-only db and it can help in tracking events back in time. However, some users of BigQuery are only interested in the final state.
Eg:
Events | State
----------------
Order1 | created
Order1 | in_Progress
Order2 | created
Order1 | out_for_Delivery
So the current state would be
Events | State
----------------
Order1 | out_for_Delivery
Order2 | created
I could see some options as below,
Option1: Store all events in BigQuery. And for creating the final state create BQ Views on top of the table that sort by timestamp and pick the latest version
Option2: I read about Projection queries (using CQRS) which help to build the application state at any point back in time. But I am not sure how this is practically done.
Please guide me here as to which option is better and also some more details on option2 which is a standard practice for building "Event sourcing + CQRS" kind of systems.
Regards, Prasad.
If you need to filter your table and show only the last status for your events, using a View is the easiest and best option. If you have a timestamp column in your event tables you can create a view using a query like below for example:
SELECT
event,
state,
timestamp_column
FROM (
SELECT
event,
state,
timestamp_column,
ROW_NUMBER() OVER(PARTITION BY event ORDER BY timestamp_column DESC) AS rn
FROM
`your-event-table`
)
WHERE rn =1
What this query is basically doing is: