According to this tutorial, when I update a file I should first remove it from DVC control (i.e. run dvc unprotect <myfile>.dvc or dvc remove <myfile>.dvc) and then add it again via dvc add <myfile>. However, it's not clear whether I should apply the same workflow to directories.
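For reference, the single-file workflow described above could be sketched roughly as follows. This is only an illustration: update_tracked is a hypothetical helper name, the DVC variable is an invented override (handy for dry runs), and passing the data file path rather than the .dvc file to dvc unprotect is an assumption about the installed DVC version.

```shell
# Hypothetical helper: update_tracked <path> <edit command...>
# Assumes <path> is already tracked by DVC.
update_tracked() {
  target="$1"
  shift
  # Replace the cache link with a regular, writable copy.
  "${DVC:-dvc}" unprotect "$target" || return 1
  # Run whatever command modifies the file (supplied by the caller).
  "$@" || return 1
  # Re-add the file so DVC records the new content.
  "${DVC:-dvc}" add "$target"
}
```

A dry run such as DVC=echo update_tracked data/1.jpg true only prints the two dvc invocations instead of executing them.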
I have a directory under DVC control with the following structure:
data/
    1.jpg
    2.jpg
Should I run dvc unprotect data every time the directory content is updated?
More specifically, I'm interested in whether I should run dvc unprotect data in the following use cases:
3.jpg image in the data dir
2.jpg image in the data dir
1.jpg image via graphic editor

Only when a file is updated, i.e. when you edit 1.jpg with your editor, AND only if the hardlink or symlink cache type is enabled.
Please check this link:
"updating tracked files has to be carried out with caution to avoid data corruption when the DVC config option cache.type is set to hardlink or/and symlink"
I would strongly recommend reading this document: Performance Optimization for Large Files. It explains the benefits of using hardlinks/symlinks.
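The rule above can be written down as a small check. This is a minimal sketch, not part of DVC: needs_unprotect is a hypothetical helper, the change labels add, delete and edit are invented for this example, and it conservatively treats any cache.type value that mentions hardlink or symlink as link-based.

```shell
# Hypothetical helper: needs_unprotect <cache_type> <change>
# <change> is one of: add, delete, edit (labels invented for this sketch).
# Returns 0 when dvc unprotect should be run first, 1 otherwise.
needs_unprotect() {
  cache_type="$1"
  change="$2"
  # Per the rule above, only in-place edits of an existing file matter.
  [ "$change" = "edit" ] || return 1
  case "$cache_type" in
    # Editing through a hardlink or symlink could corrupt the cache.
    *hardlink*|*symlink*) return 0 ;;
    *) return 1 ;;
  esac
}
```

It could be used as needs_unprotect "$(dvc config cache.type)" edit && dvc unprotect data, assuming dvc config cache.type prints the configured value. If the option is unset, the helper sees an empty string and reports that no unprotect is needed.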