Tags: postgresql, scala, apache-spark, spark-structured-streaming

Spark Structured Streaming, PostgreSQL, and updateStateByKey


How can the state of an OUTPUT table be updated by a Spark Structured Streaming computation that is triggered by changes in an INPUT PostgreSQL table?

As a real-life scenario: the USERS table has been updated for user_id = 0002. How can the Spark computation be triggered for that user only, and the results written or updated to another table?


Solution

  • Although there is no out-of-the-box solution, you can implement it the following way.

    You can use LinkedIn's Databus or a similar change-data-capture tool, which mines the database logs and produces corresponding events to Kafka. The tool tracks changes in the database bin logs. You can write a Kafka connector to transform and filter the data, then consume the events from Kafka and process them into any sink format you want. A sketch of the Spark side of such a pipeline is shown below.
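
    The following is a minimal sketch, not a definitive implementation. It assumes the CDC tool is already publishing change events for the USERS table as JSON to a Kafka topic; the topic name (`users-cdc`), the event schema, the JDBC connection details, and the output table name (`user_results`) are all illustrative assumptions, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object UsersCdcToPostgres {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("users-cdc-to-postgres")
      .getOrCreate()
    import spark.implicits._

    // Assumed JSON payload of each change event produced by the CDC tool.
    val eventSchema = new StructType()
      .add("user_id", StringType)
      .add("name", StringType)
      .add("updated_at", TimestampType)

    // Read the change events from the (assumed) Kafka topic.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "users-cdc")
      .load()
      .select(from_json($"value".cast("string"), eventSchema).as("e"))
      .select("e.*")

    // Restrict processing to the single user from the question.
    val filtered = events.filter($"user_id" === "0002")

    // Write each micro-batch to the output table over JDBC.
    // foreachBatch exposes the batch as a plain DataFrame, so you can
    // append here, or replace this with your own upsert logic.
    val writeBatch: (DataFrame, Long) => Unit = (batch, _) => {
      batch.write
        .format("jdbc")
        .option("url", "jdbc:postgresql://localhost:5432/mydb")
        .option("dbtable", "user_results")
        .option("user", "spark")
        .option("password", "secret")
        .mode("append")
        .save()
    }

    val query = filtered.writeStream
      .foreachBatch(writeBatch)
      .start()

    query.awaitTermination()
  }
}
```

    The foreachBatch sink is used here because Structured Streaming has no built-in streaming JDBC sink; each micro-batch is written with the regular batch JDBC writer, and you can swap the append for an upsert against PostgreSQL if the output table keeps one row per user.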