apache-kafka apache-kafka-streams

Kafka: Joining a Kafka Stream with a GlobalKTable using a non-key value


I've seen multiple posts stating that it's possible to join a Kafka stream with a GlobalKTable using record values instead of keys on the GlobalKTable. See https://kafka.apache.org/20/documentation/streams/developer-guide/dsl-api.html#kstream-globalktable-join

They allow for joining against foreign keys; i.e., you can lookup data in the table not just by the keys of records in the stream, but also by data in the record values.

Does that mean it's possible to join a stream with a GlobalKTable using a record value of the GlobalKTable rather than its key?

For example, I have two objects / tables:

Order --> orderId, order amount
OrderDetail --> orderDetailId, orderId, qty, price

Order will be converted into a Kafka stream with key orderId.
OrderDetail will be converted into a GlobalKTable with key orderDetailId.

orderId is a foreign key in OrderDetail.

Is it possible to join Stream(Order) and GlobalKTable(OrderDetail) on a non-key value of OrderDetail, i.e. a join on Order.orderId with OrderDetail.orderId? The intention is to retrieve a list of Orders with all of their OrderDetails.

I looked at KStreamKTableJoinProcessor and noticed that its process() method always looks up the key on the GlobalKTable. I know it's possible to select the key to be used on the left side (KStream), but is it possible to select a record value as the key on the right side (GlobalKTable) while performing the join?

One solution would be to recreate the GlobalKTable with orderId as the new key, but I don't want that because it would keep only one value per orderId in the GlobalKTable. I'm trying to fetch a one-to-many relationship.


Solution

  • To answer your question: no, it's not possible to do a KStream-GlobalKTable join using a value from the table side. You can only use a value from the stream record to match against the key of the GlobalKTable.
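
To make the direction of the lookup concrete, here is a minimal plain-Java sketch that models the semantics described above. It is not Kafka Streams code: the `join` helper, the class name `GlobalTableJoinSketch`, and the trimmed-down `Order` / `OrderDetail` types are all hypothetical, and the "table" is just a `Map` that can only be probed by its own key. In the real DSL the stream-side key selector corresponds to the `KeyValueMapper` you pass to `KStream#join(GlobalKTable, KeyValueMapper, ValueJoiner)`.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of the lookup semantics only; this is NOT Kafka Streams code.
class GlobalTableJoinSketch {

    // Hypothetical value types mirroring the question (qty and price omitted for brevity).
    static final class Order {
        final String orderId;
        final double amount;
        Order(String orderId, double amount) {
            this.orderId = orderId;
            this.amount = amount;
        }
    }

    static final class OrderDetail {
        final String orderDetailId;
        final String orderId; // foreign key to Order
        OrderDetail(String orderDetailId, String orderId) {
            this.orderDetailId = orderDetailId;
            this.orderId = orderId;
        }
    }

    // Models an inner stream-table join: the lookup key is derived from the
    // stream record, and the table is only ever probed by its own key.
    static <K, V, GK, GV, R> List<R> join(
            List<Map.Entry<K, V>> stream,
            Map<GK, GV> table,
            BiFunction<K, V, GK> keySelector,
            BiFunction<V, GV, R> joiner) {
        List<R> out = new ArrayList<>();
        for (Map.Entry<K, V> record : stream) {
            GK lookupKey = keySelector.apply(record.getKey(), record.getValue());
            GV match = table.get(lookupKey); // key lookup only; table values are never scanned
            if (match != null) {
                out.add(joiner.apply(record.getValue(), match));
            }
        }
        return out;
    }

    // The question's layout: Order is the stream, OrderDetail is the table keyed by orderDetailId.
    public static List<String> ordersAgainstDetailTable() {
        List<Map.Entry<String, Order>> orders = new ArrayList<>();
        orders.add(new SimpleImmutableEntry<>("o1", new Order("o1", 30.0)));
        orders.add(new SimpleImmutableEntry<>("o2", new Order("o2", 12.5)));

        Map<String, OrderDetail> detailsById = new HashMap<>();
        detailsById.put("d1", new OrderDetail("d1", "o1"));
        detailsById.put("d2", new OrderDetail("d2", "o1"));
        detailsById.put("d3", new OrderDetail("d3", "o2"));

        // Looking up by orderId finds nothing, because the table keys are orderDetailIds.
        return join(orders, detailsById,
                (key, order) -> order.orderId,
                (order, detail) -> order.orderId + "->" + detail.orderDetailId);
    }

    // The reversed layout: OrderDetail is the stream, Order is the table keyed by orderId.
    public static List<String> detailsAgainstOrderTable() {
        List<Map.Entry<String, OrderDetail>> details = new ArrayList<>();
        details.add(new SimpleImmutableEntry<>("d1", new OrderDetail("d1", "o1")));
        details.add(new SimpleImmutableEntry<>("d2", new OrderDetail("d2", "o1")));
        details.add(new SimpleImmutableEntry<>("d3", new OrderDetail("d3", "o2")));

        Map<String, Order> ordersById = new HashMap<>();
        ordersById.put("o1", new Order("o1", 30.0));
        ordersById.put("o2", new Order("o2", 12.5));

        // The foreign key in the stream value (orderId) matches the table's key.
        return join(details, ordersById,
                (key, detail) -> detail.orderId,
                (detail, order) -> detail.orderDetailId + "->" + order.orderId);
    }

    public static void main(String[] args) {
        System.out.println(ordersAgainstDetailTable()); // prints []
        System.out.println(detailsAgainstOrderTable()); // prints [d1->o1, d2->o1, d3->o2]
    }
}
```

In this model the "foreign key" wording from the documentation only works in the second layout, where the foreign key lives in the stream record and points at the table's key. With Order as the stream and OrderDetail keyed by orderDetailId, there is nothing on the stream side that can address a table entry.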