Search code examples
javapostgresqljdbcdatabase-replication

How to process the PGReplicationStream from physical replication JDBC API?


I am trying to read the data from physical replication slot using JDBC's ReplicationAPI

I am using Postgres 13

Here is the sample code

        LogSequenceNumber lsn = getCurrentLSN(sqlConnection);

        PGReplicationStream stream =
                pgConnection
                        .getReplicationAPI()
                        .replicationStream()
                        .physical()
                        .withStartPosition(lsn)
                        .start();

        ByteBuffer buffer = stream.read();

        int offset = buffer.arrayOffset();
        byte[] source = buffer.array();
        int length = source.length - offset;
        System.out.println(new String(source, offset, length));

My code outputs the garbled string, Is there right a way to read the bytes returned from the PGReplicationStream I have not found a single document as to how to process the bytes being returned from PGReplicationStream

The DML inserted to DB: insert into t2(id,name) values(1,'Jane Doe');

Garbled string Ouput: D�3h �L� �4@�Jane Doe.�3h�A$���9��'���4

HexString of Byte Array: 7700000000016833c00000000001683438000298be27a1a28f440000000e0200008833680100000000000a0000f24ce4b1002013007f060000843400001f40000000000000ff0302000208180001000000134a616e6520446f65030008000000002e0000000e020000c03368010000000080010000412497e4ff14399ea127be98020001000000843400007f0600000000


Solution

  • With (physical) streaming replication, the transaction log (WAL) is transferred as it is, which is very low-level binary information. Think of it as “in file X, replace the 42 bytes starting at offset 123 with the following: ...”. That is information that does not make any sense outside the database itself (or outside a streaming replication standby server, which is a physical copy).

    There is no way you can reliably extract useful information from WAL. If you need that, you have to use logical decoding, where a logical decoding plugin on the primary server reverse engineers the WAL information into useful logical information.

    The use cases for streaming physical information are very limited. One use case I can imagine is if you want to write a WAL archival program in Java, similar to the built-in pg_receivewal.