Access the checksum of a ZFS dataset via the CLI


Is it possible to read the checksum of a ZFS dataset? I want to use it to verify that the dataset didn't change between boots. According to https://en.wikipedia.org/wiki/ZFS#ZFS_data_integrity, ZFS uses a Merkle-tree-like checksumming scheme. Is the top-level checksum of that tree accessible from userspace?


Solution

  • There's a tool called zdb, aimed mainly at developers, that can do this. It's hard to use, and its output format is not always backwards compatible :-)

    However, if all you want is to make sure that a filesystem hasn't changed, you can use snapshots for this purpose. First, create a snapshot at the point you want to compare to later on with zfs snapshot <pool>/<fs>@<before-reboot-snap>. Then there are two different ways to compare the filesystem to that snapshot later:

    1. After reboot, run zfs diff <pool>/<fs>@<before-reboot-snap> <pool>/<fs>. This will show you a list of "diffs" between the snapshot and the current filesystem:

      # ls /tank/hello
      file1  file2  file3  file4  file5
      # zfs snapshot tank/hello@snap
      # zfs diff tank/hello@snap tank/hello
      # touch /tank/hello/file6
      # zfs diff tank/hello@snap tank/hello
      M       /tank/hello/
      +       /tank/hello/file6
      # rm /tank/hello/file6
      # zfs diff tank/hello@snap tank/hello
      M       /tank/hello/
      

      Note that even after I deleted the new file, the directory it lived in is still marked as modified.

    2. Take another snapshot after the reboot, and then use zfs send -i @<before-reboot-snap> <pool>/<fs>@<after-reboot-snap> to create a stream of all the changes that happened between those snapshots, and analyze it with another tool called zstreamdump:

      # zfs send -i @snap tank/hello@snap2 | zstreamdump
      BEGIN record
              hdrtype = 1
              features = 4
              magic = 2f5bacbac
              creation_time = 59036f98
              type = 2
              flags = 0x4
              toguid = 2f080aca53bff68e
              fromguid = 66a1da82cd5f1571
              toname = tank/hello@snap2
      END checksum = 91043406e5/38f3c4043049b/ed0867661876670/1e265bea2b6c3315
      SUMMARY:
              Total DRR_BEGIN records = 1
              Total DRR_END records = 1
              Total DRR_OBJECT records = 12
              Total DRR_FREEOBJECTS records = 5
              Total DRR_WRITE records = 1
              Total DRR_WRITE_BYREF records = 0
              Total DRR_WRITE_EMBEDDED records = 0
              Total DRR_FREE records = 17
              Total DRR_SPILL records = 0
              Total records = 37
              Total write size = 512 (0x200)
              Total stream length = 13232 (0x33b0)
      

      The summary above shows that the stream contains changes: any WRITE, FREE, OBJECT, or FREEOBJECTS record indicates a difference from the original snapshot.
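
Both checks can be scripted. The following is only a minimal sketch, not part of zfs itself: diff_status and count_change_records are hypothetical helper names, and they do nothing more than read the output of zfs diff or zstreamdump on stdin, assuming the output format shown in the examples above. The second helper follows the rule of thumb above that OBJECT, FREEOBJECTS, WRITE, and FREE records indicate a change.

```shell
#!/bin/sh

# Hypothetical helper: reads `zfs diff` output on stdin.
# Prints "unchanged" if there is no output at all, "changed" otherwise.
diff_status() {
    if grep -q .; then
        echo changed
    else
        echo unchanged
    fi
}

# Hypothetical helper: reads zstreamdump output on stdin and prints the sum
# of the DRR_OBJECT, DRR_FREEOBJECTS, DRR_WRITE and DRR_FREE totals from the
# SUMMARY section (assumes lines of the form "Total DRR_X records = N").
count_change_records() {
    awk '
        $1 == "Total" && $3 == "records" && $4 == "=" &&
        ($2 == "DRR_OBJECT" || $2 == "DRR_FREEOBJECTS" ||
         $2 == "DRR_WRITE"  || $2 == "DRR_FREE") { n += $5 }
        END { print n + 0 }
    '
}

# Usage, with the dataset and snapshot names from the examples above:
#   zfs diff tank/hello@snap tank/hello | diff_status
#   zfs send -i @snap tank/hello@snap2 | zstreamdump | count_change_records
```

Keep in mind that, as the first example shows, zfs diff still reports the directory as modified after a file is created and deleted again, so diff_status would print "changed" in that case as well.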