ruby-on-rails  checksum  rails-activestorage

Active Storage Checksum confusion


I just uploaded a test file (png image) using Active Storage to Amazon S3.

One thing I've noticed is that the ETag returned for the file by the S3 API is different from the checksum stored in the blob record in the database for that file.

I calculated the file's MD5 checksum on this site: https://appdevtools.com/checksum-calculator and the result matches the S3 ETag.

Why is the checksum stored in the DB blob different?

test-file.png:

Amazon S3 ETag via API:                  f1d0a62d6890cf4c4ecb4337c6d789df
`checksum` in Database:                  8dCmLWiQz0xOy0M3xteJ3w==
MD5 Checksum when checking on website:   f1d0a62d6890cf4c4ecb4337c6d789df

Can anyone explain this and also how the one in the database relates to the file?

Thanks


Solution

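  • It is the same MD5 digest, just encoded differently.

    The value "f1d0a62d6890cf4c4ecb4337c6d789df" is the digest written as a hex string.

    The value "8dCmLWiQz0xOy0M3xteJ3w==" is the digest written as a base64-encoded string.

    Both encode the same 16 raw bytes of the MD5 checksum, so the value in the database relates to the file exactly as the ETag does.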

    To convert from base64 to hex:

    Base64.decode64('8dCmLWiQz0xOy0M3xteJ3w==').unpack('H*')
     => ["f1d0a62d6890cf4c4ecb4337c6d789df"]
    

    To convert from hex to base64:

    Base64.encode64(["f1d0a62d6890cf4c4ecb4337c6d789df"].pack('H*')).chomp
     => "8dCmLWiQz0xOy0M3xteJ3w=="
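
If you want to check this against the file itself rather than converting between the two strings, the following is a minimal sketch using only Ruby's standard `digest` and `base64` libraries. The helper names (`md5_encodings`, `base64_to_hex`, `hex_to_base64`) are made up for illustration and are not part of Active Storage or the AWS SDK. It assumes the hex form is what you are comparing against the ETag and the base64 form is what you see in the `checksum` column, as in the question.

```ruby
require 'digest'
require 'base64'

# Hypothetical helper: compute the MD5 digest of some data once and
# return it in both encodings discussed above.
def md5_encodings(data)
  raw = Digest::MD5.digest(data)          # 16 raw bytes
  {
    hex:    raw.unpack1('H*'),            # hex form, as in the ETag from the question
    base64: Base64.strict_encode64(raw)   # base64 form, as in the blob's checksum column
  }
end

# Hypothetical helper: base64 checksum -> hex string.
def base64_to_hex(b64)
  Base64.strict_decode64(b64).unpack1('H*')
end

# Hypothetical helper: hex string -> base64 checksum.
# strict_encode64 adds no trailing newline, so no chomp is needed.
def hex_to_base64(hex)
  Base64.strict_encode64([hex].pack('H*'))
end
```

For a file on disk you could call `md5_encodings(File.binread('test-file.png'))` and compare the `:hex` entry with the ETag and the `:base64` entry with the database value.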