Tags: python, sqlalchemy, outer-join

Complex join with nested subqueries in SQLAlchemy - find the closest date match


We have two tables that share a global_id to join on; call them device_updates and client_updates. Their schemas (reduced here to the columns relevant to the problem) look like the following:

device_updates

update_id  global_id  created   device_date
Integer    Integer    Datetime  Datetime

client_updates

client_update_id global_id updated updated_by update_type
Integer Integer Datetime Integer String

*Note that the real table stores update_type as an Integer; a String is shown here for clarity.
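For reference, here is a minimal sketch of how the two tables might be declared with the SQLAlchemy ORM, assuming the column names and types above (the class names mirror how they are used in the query further down, even though PEP 8 would prefer CamelCase):

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class device_updates(Base):
    __tablename__ = 'device_updates'
    update_id = Column(Integer, primary_key=True)
    global_id = Column(Integer, index=True)   # shared id linking the two tables
    created = Column(DateTime)                # when the row was written
    device_date = Column(DateTime)            # when the device changed state

class client_updates(Base):
    __tablename__ = 'client_updates'
    client_update_id = Column(Integer, primary_key=True)
    global_id = Column(Integer, index=True)
    updated = Column(DateTime)                # when the client acted
    updated_by = Column(Integer)              # id of the acting client
    update_type = Column(String)              # e.g. "Silence" (an Integer in the real table)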

When a device changes state, it always writes a row to the device_updates table. If the new state is an alarm, the client is notified and can choose to 'update' that alarm, for example by silencing it.

What we need to be able to do is generate a history of all events, from both device updates and client updates, in chronological order. But since neither table is aware of the other (beyond the shared global_id), I can't figure out how to ensure that each returned row contains the exact state at that point in time. Adding a column to client_updates that references the matching device_updates row is not an option either: a few thousand devices can update every second, a client can silence all of them with a single click, and because we plan to use bulk inserts to meet those speed requirements there is no opportunity to assign a matched id per row.

As an example flow:

  1. device with global_id 15 changes state
  2. client 7 updates this to silence it a few seconds later
  3. device with global_id 15 changes state again at some time in the future

What should be returned in an 'event_history' query is the following:

update_id  global_id  created   device_date  client_update_id  updated   updated_by  update_type
1          15         10:53:24  10:53:23     Null              Null      Null        Null
2          15         10:53:24  10:53:23     1                 10:53:34  7           "Silence"
3          15         10:57:09  10:57:08     Null              Null      Null        Null

To sum up: how do I join the device_updates table to the client_updates table, given that a device update won't necessarily have an associated client update, but every client update needs to find the exact device update that matches its updated time (the client update always occurs after the device_date)? Note also that there are potentially many thousands of devices, so this is not just for a single global_id.
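To make the matching rule concrete, here is a small pure-Python sketch of the intended semantics over in-memory rows (dictionaries standing in for table rows; purely illustrative, not the database query we're after):

def build_event_history(device_rows, client_rows):
    # Every device update is an event in its own right.
    history = [
        {**d, 'client_update_id': None, 'updated': None,
         'updated_by': None, 'update_type': None}
        for d in device_rows
    ]
    # Each client update is paired with the newest device update for the
    # same global_id whose device_date precedes the client's updated time.
    for c in client_rows:
        candidates = [
            d for d in device_rows
            if d['global_id'] == c['global_id'] and d['device_date'] < c['updated']
        ]
        match = max(candidates, key=lambda d: d['device_date'], default=None)
        if match is not None:
            history.append({**match, **c})
    # Chronological order: client events sort by their updated time,
    # plain device events by their device_date.
    history.sort(key=lambda r: r['updated'] or r['device_date'])
    return history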

UPDATE: I forgot to add an example of a query I have tried, which is below:

db.query(
  client_updates,
  device_updates
).outerjoin(
  device_updates,
  # Intended: match each client update to the newest earlier device update.
  device_updates.update_id==db.query(
    device_updates
  ).filter(
    device_updates.global_id==client_updates.global_id,
    device_updates.device_date<client_updates.updated
  ).order_by(
    device_updates.device_date.desc()
  ).first().update_id
)

But this appears to evaluate the subquery once and then match every row against the same update_id.
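For what it's worth, the likely reason is that .first() executes the inner query once, in Python, while the outer query is still being constructed, so its single result is baked into the join condition as a constant. Here is a hedged sketch of keeping the lookup inside the database with a correlated scalar subquery instead (SQLAlchemy 1.4+ method names assumed):

from sqlalchemy.orm import aliased

# Alias so the subquery's device_updates is distinct from the joined one.
du = aliased(device_updates)

# Evaluated by the database once per client_updates row, not once in Python.
closest_update_id = (
    db.query(du.update_id)
    .filter(
        du.global_id == client_updates.global_id,
        du.device_date < client_updates.updated,
    )
    .order_by(du.device_date.desc())
    .limit(1)
    .correlate(client_updates)
    .scalar_subquery()
)

history = db.query(client_updates, device_updates).outerjoin(
    device_updates, device_updates.update_id == closest_update_id
)

On PostgreSQL, a LATERAL join can express the same per-row lookup and may plan better at this scale.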


Solution

  • Ended up going with the Kafka solution. Super speedy results: my test build comfortably handled a very high throughput, and I now have confidence in it being robust and fast.
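For context, here is a minimal sketch of what the producer side of such a Kafka pipeline could look like, using the kafka-python package; the topic name, broker address, and message layout are assumptions for illustration, not the author's actual setup. Keying messages by global_id keeps all events for one device in a single partition, so Kafka preserves their relative order and the consumer can build the chronological history directly.

import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers='localhost:9092',          # assumed broker address
    key_serializer=lambda k: str(k).encode(),
    value_serializer=lambda v: json.dumps(v, default=str).encode(),
)

def publish_device_update(update_id, global_id, device_date):
    # Keying by global_id routes all events for one device to one
    # partition, so their relative order is preserved.
    producer.send('update-events', key=global_id, value={
        'source': 'device', 'update_id': update_id,
        'global_id': global_id, 'device_date': device_date,
    })

def publish_client_update(client_update_id, global_id, updated, updated_by, update_type):
    producer.send('update-events', key=global_id, value={
        'source': 'client', 'client_update_id': client_update_id,
        'global_id': global_id, 'updated': updated,
        'updated_by': updated_by, 'update_type': update_type,
    })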