Search code examples
postgresqlapache-kafkaredisetlapache-kafka-connect

Redis Pub/Sub - ETL -> Postgres


I have a simple task:

  1. Subscribe to messages on Redis channel

  2. Transform message, e.g.

    HASH: '<user_id>|<user_type>|<event_type>|...' with items: { 'param_1': 'param_1_value', 'param_2': 'param_2_value', ... } into tabular form

user_id event_type param_1 param_2 ...
<user_id> <event_type> cleaned(param_1_value) cleaned(param_2_value) ...
  1. Append to an existing table in Postgres

Additional context:

  1. The scale of events is rather small
  2. Refreshments must be done at most every ~15 minutes
  3. Solution must be deployable on premises
  4. Using something else as a queue than Redis is not an option

The best solution I came up with is to use Kafka, with Kafka Redis Source Connector (https://github.com/jaredpetersen/kafka-connect-redis) and then Kafka Postgres Sink Connector (https://github.com/ibm-messaging/kafka-connect-jdbc-sink). It seems reasonable, but the task seems like generic Redis to Postgres ETL and I'm wondering if there is really no easier out of the box solution out there.


Solution

  • You could just write a script and execute it via cron. But take a look at the Benthos project as you can easily run it on prem and what you describe can be done entirely via configuration for Redis -> Postgres.