postgresql, checksum

Checksum field in PostgreSQL for content comparison


I have a field in a table with large content (text or binary data). If I want to know whether another text is equal to this one, I can use a checksum to compare the two texts. I can also define this field as UNIQUE to avoid duplicate content.

My question is whether creating a checksum field will speed up this comparison. Does PostgreSQL already do this (without programmer intervention), or do I need to do it manually?

EDIT: Which is better: creating a checksum for a TEXT field or using a checksum for it, or are the two ways the same thing?


Solution

  • There is no default "checksum" for large columns in PostgreSQL; you will have to implement one yourself.

    Reply to comment

    Hash indexes provide fast performance for equality checks, and they are updated automatically. But in PostgreSQL versions before 10 they are not WAL-logged and therefore not crash-safe, so their use is discouraged there - read the manual for your version.
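
    As a rough illustration of what a hash index looks like in practice, here is a minimal sketch. The table `articles`, its column `body`, and the index name are hypothetical names chosen for this example; they are not from the question.

    ```sql
    -- Hypothetical table holding the large content.
    CREATE TABLE articles (
        id   bigserial PRIMARY KEY,
        body text NOT NULL
    );

    -- A hash index stores a hash of each value and supports only the = operator.
    CREATE INDEX articles_body_hash_idx ON articles USING hash (body);

    -- Equality lookups like this one can use the index;
    -- PostgreSQL keeps the index up to date on every write.
    SELECT id FROM articles WHERE body = 'some large text ...';
    ```

    Note that a hash index cannot be declared UNIQUE, so this sketch only helps with lookups, not with rejecting duplicate content.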

    Also, you cannot query the hash values stored in a hash index and use them in your application, for instance. You could do that with a checksum column, but you need to maintain the column yourself and, if your table is big, add an index on it for performance. I would use a trigger BEFORE INSERT OR UPDATE for that.
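
    A checksum column maintained by a trigger could be sketched roughly as follows. This assumes the built-in md5() function is an acceptable checksum; the table `documents`, the column `body_md5`, and the function and trigger names are hypothetical.

    ```sql
    -- Hypothetical table with an extra column for the checksum.
    CREATE TABLE documents (
        id       bigserial PRIMARY KEY,
        body     text NOT NULL,
        body_md5 text NOT NULL
    );

    -- Trigger function: recompute the checksum whenever a row is written.
    CREATE FUNCTION documents_set_md5() RETURNS trigger AS $$
    BEGIN
        NEW.body_md5 := md5(NEW.body);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER documents_md5_trg
        BEFORE INSERT OR UPDATE ON documents
        FOR EACH ROW EXECUTE PROCEDURE documents_set_md5();

    -- Index on the short checksum; UNIQUE also rejects duplicate content.
    CREATE UNIQUE INDEX documents_body_md5_idx ON documents (body_md5);

    -- Look up by checksum, then confirm on the full text to rule out collisions.
    SELECT id FROM documents
    WHERE body_md5 = md5('candidate text')
      AND body = 'candidate text';
    ```

    The UNIQUE index in this sketch treats two texts with the same MD5 value as duplicates. A collision between different texts is very unlikely but possible, so drop UNIQUE if that matters for your data.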

    So, a hash index may or may not be for you. @A.H.'s idea certainly fits the problem ...