c# algorithm ms-word checksum

Is there any metadata associated with a Word document?


I am trying to generate a checksum of a Word document by reading it at the binary level. I generate the checksum of the document, then copy the document to a different location. When I generate the checksum at the new location, I get a different value, even though I haven't changed the contents of the document. The checksum varies even if I copy the document back to the same location. This does not happen with other file types such as .txt or .pdf files, which suggests that there are no bugs in the checksum generation. My guess is that by opening a .doc file at the binary level, I am also including the document's metadata in the checksum, and that this metadata varies. Am I right? Please enlighten me.
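
For reference, the experiment described above could be sketched roughly as follows. The post does not say which checksum algorithm or copy method was used, so SHA-256 and File.Copy are assumptions here, and the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper illustrating the checksum-then-copy experiment.
public static class ChecksumExperiment
{
    // Hash the raw bytes of the file. File-system metadata such as
    // timestamps or the path is not part of the stream being hashed.
    public static string Sha256OfFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }

    // Copy the file byte for byte (assumed copy method: File.Copy)
    // and report whether the checksum is the same before and after.
    public static bool CopyPreservesChecksum(string source, string destination)
    {
        string before = Sha256OfFile(source);
        File.Copy(source, destination, true); // true = overwrite an existing destination
        string after = Sha256OfFile(destination);
        return before == after;
    }
}
```

If CopyPreservesChecksum returns false for a .doc file, the bytes themselves changed between the two reads, which points at something other than a plain byte-for-byte copy.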


Solution

  • .doc files are OLE compound files (structured storage), and .docx files are ZIP-compressed packages of XML files, so the short answer is yes: there is all manner of metadata attached to a Word document.

    That said, simply copying any file to a new location (as opposed to copying the contents of the file into a new file) shouldn't modify it. How are you copying it?
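
To see which of the two container formats a given file actually uses, one option is to look at its first few bytes. The sketch below is only illustrative: the class and method names are hypothetical, and it checks nothing beyond the well-known leading signatures of OLE compound files and ZIP archives, so it cannot tell a Word document apart from any other file using the same container.

```csharp
using System;
using System.IO;

// Hypothetical probe: classifies a file by its leading signature bytes.
public static class WordContainerProbe
{
    // Leading bytes of an OLE compound file (legacy .doc and similar).
    static readonly byte[] OleSignature = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Leading bytes of a ZIP local file header (.docx and other ZIP-based files).
    static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    public static string Identify(string path)
    {
        byte[] header = new byte[8];
        int read;
        using (var stream = File.OpenRead(path))
        {
            read = stream.Read(header, 0, header.Length);
        }

        if (StartsWith(header, read, OleSignature)) return "OLE compound file";
        if (StartsWith(header, read, ZipSignature)) return "ZIP package";
        return "unknown";
    }

    // True when the first 'length' valid bytes of 'data' begin with 'signature'.
    static bool StartsWith(byte[] data, int length, byte[] signature)
    {
        if (length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

Either way, the metadata lives inside the file's own bytes, so a byte-for-byte copy carries it along unchanged.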