Update null values in a column with values from same table but by referencing indirectly related matching value in another table

I need to write a Snowflake SQL to update a null value(cid) in a table transaction from cid values in the same table but by referencing indirectly related matching value(hash) in another table payment.

Please refer the attached sample data & expected result for high level view. Also added insert queries now. I've tried explaining the data below:

Suppose, I have two tables in snowflake - transaction and payment.
transaction table has columns - tid, cid, revenue1, revenue2.
- tid is primary key. cid can be null.
payment table has columns - id, tid, hash, amount.
- id is primary key.
- tid is always not null but not unique
- hash can be null.

Rules:

not all tid in transaction are present in payment.
for some tid in transaction table there can be a hash in the payment table.
other tid can be present in payment table but with null hash.

Example: Please refer the attached sample data and expected result. let's say there is a tid 123 with null cid in transaction but has a corresponding hash in the payment table. the same hash value in payment table is also mapped with transaction tid 770. Now this tid 770 back in transaction table has its cid populated as cid_2. we want to use this situation so that cid value cid_2 of tid 770 gets copied to cid value of tid 123 in transaction table due to their common hash in payment table.

Question: I need to write an snowflake query for this to populate all such null cid in transaction table wherever possible. some cid might still remain blank in transaction table if a matching hash is not found in payment table for their respective tid

Sample data:

Transaction:

TID	CID	revenue1	revenue2
123	null	1313121	1309075
456	null	1312312	1309076
789	null	1311503	1309077
312	null	1310694	1309078
345	null	1309885	1309079
678	null	1309076	1309080
679	null	1309077	1309081
770	cid_2	309885	1309079
880	cid_1	1309076	1309080
234	null	1309885	1309079

Payment:

ID	TID	hash	Revenue
23423	123	3333	56
34534	456	1111	65
23423	789	2222	67
32432	880	1111	87
54754	770	3333	45
98797	312	null	65
14284	234	3333	54

Expected Output for final transaction table: (The bold values are new and achieved through payment's hash reference.

TID	CID	revenue1	revenue2
123	cid_2	1313121	1309075
456	cid_1	1312312	1309076
789	null	1311503	1309077
312	null	1310694	1309078
345	null	1309885	1309079
678	null	1309076	1309080
679	null	1309077	1309081
770	cid_2	1309885	1309079
880	cid_1	1309076	1309080
234	cid_2	1309885	1309079

Insert queries for transaction & payment:

CREATE TABLE transaction (
    TID INT PRIMARY KEY,
    CID VARCHAR(255),  -- Assuming CID is a string, adjust type as needed
    revenue1 INT,
    revenue2 INT
);

INSERT INTO transaction (TID, CID, revenue1, revenue2)
VALUES
(123, NULL, 1313121, 1309075),
(456, NULL, 1312312, 1309076),
(789, NULL, 1311503, 1309077),
(312, NULL, 1310694, 1309078),
(345, NULL, 1309885, 1309079),
(678, NULL, 1309076, 1309080),
(679, NULL, 1309077, 1309081),
(770, 'cid_2', 1309885, 1309079),
(880, 'cid_1', 1309076, 1309080),
(234, NULL, 1309885, 1309079);

------------------------------------------------------------------------------------------
CREATE TABLE payment (
    TPMID INT,
    TID INT,
    cardhash INT,
    Revenue INT,
    PRIMARY KEY (TPMID, TID)  -- Assuming the combination of TPMID and TID is unique
);

INSERT INTO payment (TPMID, TID, cardhash, Revenue)
VALUES
(23423, 123, 3333, 56),
(34534, 456, 1111, 65),
(23423, 789, 2222, 67),
(32432, 880, 1111, 87),
(54754, 770, 3333, 45),
(98797, 312, NULL, 65),
(14284, 234, 3333, 54);

Solution

I got this to work using a CTE and then self joining based on the hash value. This essentially dupes the cardhash value for every unique TID. This was a fun one, hope it helps.

WITH JOINED AS (SELECT T.TID AS TID
, CID
, REVENUE1
, REVENUE2
, CARDHASH
FROM TRANSACTION T

LEFT JOIN PAYMENT P
ON P.TID = T.TID)
, CARDHASH_DUPES AS (SELECT J.CARDHASH
, J.TID
, R.CID
FROM JOINED J

INNER JOIN JOINED R
ON J.CARDHASH = R.CARDHASH
AND R.CID IS NOT NULL)
SELECT J.TID
, H.CID
, J.REVENUE1
, J.REVENUE2
FROM JOINED J

LEFT JOIN CARDHASH_DUPES H
ON J.TID = H.TID
;