Imagine file-based caching of some process's result on a Linux machine.
- We run the (resource-consuming) process only when the source data has changed.
- On every query for the result, we check whether the source data has changed.
- If the data has changed, we process it and save the cache.
- Changes and cache freshness (whether the cache was created after the last change) are checked by comparing file modification times (source data and the cache file).
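The freshness check described in the list above could be sketched roughly like this. The function name `cache_is_fresh` and its parameters are made up for illustration, and treating equal timestamps as stale is an assumption, not something stated in the question:

```python
import os

def cache_is_fresh(cache_path, source_paths):
    """Return True only if the cache exists and is newer than every source file."""
    try:
        cache_mtime = os.stat(cache_path).st_mtime_ns
    except FileNotFoundError:
        # No cache yet, so it has to be built.
        return False
    # Assumption: a source with the same mtime as the cache counts as changed,
    # because the two timestamps cannot be ordered reliably.
    return all(os.stat(p).st_mtime_ns < cache_mtime for p in source_paths)
```

A caller would rebuild the cache whenever this returns `False`.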
Tricky part: the process takes some time, and the data might change while we're processing it. Is it safe to set the cache's modification time to the time of the query that created it?
It's something like:
- Source data changed at 20:00:01.
- A query came at 20:00:05, so we start recreating the cache.
- Finished at 20:04:15.
- Saving cache.
- Changing the modification time of the cache file to 20:00:05 (to show that changes made after 20:00:05 are not included).
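In Python, the sequence above might look something like the following sketch. `rebuild_cache` and the `build` callback are hypothetical names; `os.utime` with the `ns` argument is the standard call for setting a file's access and modification times:

```python
import os
import time

def rebuild_cache(cache_path, build):
    """Rebuild the cache and back-date its mtime to the moment the query arrived."""
    started_ns = time.time_ns()      # e.g. 20:00:05, when the query came in
    data = build()                   # slow step; sources may change meanwhile
    with open(cache_path, "wb") as f:
        f.write(data)                # e.g. 20:04:15, mtime is now "finish time"
    # Set atime and mtime back to the query time, so that any source change
    # made during the build still looks newer than the cache.
    os.utime(cache_path, ns=(started_ns, started_ns))
```

Note that between the write and the `os.utime` call the file briefly carries the later timestamp, so a concurrent reader could see it as fresher than it really is.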
Is it safe? For backups, deployment, source control... What situations might create problems with such a solution?
No. The safe solution is to write the file under a temporary name in the same directory and then rename it once all the data has been written. Because both names are on the same filesystem, the rename is atomic.
- This way, you will never have an incomplete/truncated file
- If you have an error, the original data will still be intact
- For error handling, you just need to delete the temporary file
It also solves the problems with backups and source control: you can make them ignore the temp files.
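A minimal sketch of the write-then-rename approach, assuming Python on Linux. The function name `write_cache_atomically` and the `.cache-`/`.tmp` naming are illustrative choices; `tempfile.mkstemp` and `os.replace` are standard library calls:

```python
import os
import tempfile

def write_cache_atomically(cache_path, data):
    """Write data to a temp file next to cache_path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    # Same directory means same filesystem, so the rename below is atomic.
    # Note: mkstemp creates the file with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cache-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())     # make sure the data is on disk before renaming
        os.replace(tmp_path, cache_path)
    except BaseException:
        # Error handling is just deleting the temp file; the old cache is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Readers either see the old cache or the complete new one, never a half-written file, and a pattern such as `.cache-*.tmp` can be excluded from backups and source control.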