windows filesystems ntfs microsoft-distributed-file-system

What is the file SimilarityTable_1 (8GB)?


I created a new filesystem (F:\, the destination) and synced it with another (E:\, the source). Curiously, the new filesystem uses 7GB more space than the source.

I found that the file below accounts for the difference: on the source it is 8GB in size with 1GB used on disk, while on the destination it is 8GB in size with 8GB used on disk.

I don't know whether I can delete this file safely. I suspect not.

E:\System Volume Information\DFSR\SimilarityTable_1

Questions: (1) What's this file? (2) How can I fix it?


Solution

  • It is an internal database file that DFS Replication (DFSR) uses to keep track of signatures for content it has already seen. Given the signatures of a new file, it can then build that file from chunks of data it already has.

    Cross File RDC:

    By using a special hidden sparse file (located in drive:\system volume information\dfsr\similaritytable_1) to track all these signatures, we can use other similar files that we already have to build our copy of a new file locally. Up to five of these similar files can be used. So if an upstream server says "I have file X and here are its RDC signatures", we, the downstream server, can say "ah, I don't have that file X. But I do have files Y and Z that have some of the same signatures, so I'll grab data from them locally and save you having to transmit it to me over the wire."

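    Because the file is sparse, its 8GB size is only nominal: the source volume allocates about 1GB for it, whereas a sync tool that does not preserve sparseness writes out all 8GB on the destination, which would explain the 7GB difference.

    Depending on what you are trying to do, it is safe to delete that directory and clean up the old database files. See this manual for the steps involved (and when to do so).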
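
To make the cross-file RDC idea quoted above more concrete, here is a minimal Python sketch of a signature lookup. It is not DFSR's actual implementation: the `SimilarityTable` class, the `signatures` helper, the fixed 4-byte chunks, and the use of SHA-256 are all assumptions made for illustration (real RDC picks chunk boundaries based on content and uses its own signature format). Only the "up to five similar files" limit comes from the quoted text.

```python
import hashlib
from collections import Counter, defaultdict

CHUNK_SIZE = 4          # assumption: tiny fixed-size chunks, for illustration only
MAX_SIMILAR_FILES = 5   # from the quoted text: up to five similar files are used


def signatures(data):
    """Split data into fixed-size chunks and return one hash per chunk."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


class SimilarityTable:
    """Hypothetical in-memory stand-in for the on-disk signature database."""

    def __init__(self):
        # Maps a chunk signature to the names of local files containing it.
        self._files_by_signature = defaultdict(set)

    def add_file(self, name, data):
        """Record every chunk signature of a file we already hold locally."""
        for sig in signatures(data):
            self._files_by_signature[sig].add(name)

    def similar_files(self, remote_signatures):
        """Return up to MAX_SIMILAR_FILES local files sharing the most signatures."""
        hits = Counter()
        for sig in set(remote_signatures):
            for name in self._files_by_signature.get(sig, ()):
                hits[name] += 1
        return [name for name, _ in hits.most_common(MAX_SIMILAR_FILES)]


# The downstream server already holds files Y, Z and W.
table = SimilarityTable()
table.add_file("Y", b"aaaabbbbcccc")
table.add_file("Z", b"bbbbdddd")
table.add_file("W", b"zzzzyyyy")

# The upstream server announces file X by sending only its signatures.
remote = signatures(b"aaaabbbbccccdddd")
similar = table.similar_files(remote)
print(similar)  # ['Y', 'Z']
```

In this toy run, Y shares three chunks with X and Z shares two, so both are offered as local sources, while W shares none and is ignored. Only the chunks that no local file can supply would need to cross the wire.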