
How to detect a file move under win32 NTFS?


I have a database that attaches additional information to any file on the file system (NTFS). Each file is identified by its full path, so to keep the database consistent I need to detect when any of the files in the DB is deleted, renamed or moved.

For the moment, I am trying to achieve this by using the ReadDirectoryChangesW function with FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_DIR_NAME as filter criteria.

The problem is that this way I only get notifications for renames, creations and deletions. I therefore have to guess when a move occurs from the 'added' and 'removed' events and the related file name (on the same volume, a move [Ctrl-X, Ctrl-V] is reported as a file removal immediately followed by a file creation; the paths differ, but the file name stays the same).

Does anybody know of a better solution?

Here is my understanding based on observations:

About moving files under NTFS

Within the same volume

A 'removed' event is immediately followed (with little or no delay) by an 'added' event for the same file name under a different path (whatever the size of the moved file).

Special case if a whole volume is being watched: when a file is deleted, it is actually moved to the recycle bin (the path contains the volume's recycle bin folder, but the file name is different [some kind of hash]).

Between two distinct volumes

First, there is an 'added' event on the destination volume.
Afterward, when the copy completes, there is a 'removed' event on the source volume.

(Note: several other events might occur in the meantime; the bigger the file, the longer the lag.)
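
To apply these observations, each event path has to be reduced to a drive letter, a bare file name and a "is this inside the recycle bin" flag. Below is a minimal sketch in portable C++ of that step. The helper names BaseName, DriveLetter and IsRecycleBinPath are hypothetical (they are not Windows API functions), and the folder names "$Recycle.Bin" (Vista and later) and "RECYCLER" (XP on NTFS) are assumptions about the Windows versions being targeted.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helpers for illustration; none of these are Windows API calls.

// Returns the last path component (the text after the final '\' or '/').
std::wstring BaseName(const std::wstring& path) {
    const std::size_t pos = path.find_last_of(L"\\/");
    return pos == std::wstring::npos ? path : path.substr(pos + 1);
}

// Returns the upper-cased drive letter of a path such as L"c:\\dir\\file",
// or 0 when the path does not start with "<letter>:".
wchar_t DriveLetter(const std::wstring& path) {
    if (path.size() >= 2 && path[1] == L':' && std::iswalpha(path[0])) {
        return static_cast<wchar_t>(std::towupper(path[0]));
    }
    return 0;
}

// Case-insensitive test for a recycle bin folder anywhere in the path.
// Assumption: the folder is named "$Recycle.Bin" or "RECYCLER".
bool IsRecycleBinPath(const std::wstring& path) {
    std::wstring lower(path);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](wchar_t c) { return static_cast<wchar_t>(std::towlower(c)); });
    return lower.find(L"\\$recycle.bin\\") != std::wstring::npos ||
           lower.find(L"\\recycler\\") != std::wstring::npos;
}
```

With these, two events can be compared by BaseName and DriveLetter only, which is all the observations above rely on.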


Solution

  • There is indeed no way to be notified of a 'moved' event, because a move always shows up either as a rename of the path/file name or as a pair of deletion and creation events.
    Using the USN journal could make things a little easier, but some additional work would still be needed.

    In this case, I need to check filesystem changes on the fly (my application is running in background), so there is no point using a log (journal).

    Here is the logic I came up with, intended for use with the DeviceIoControl and ReadDirectoryChangesW functions and a queue holding custom FileActionInfo items.

    struct FileActionInfo {
        WCHAR fileName[FILE_NAME_MAX];
        CHAR drive;
        DWORD action;
        time_t timeStamp;
    };
    

    The same logic should probably also be useful for inferring move events when using the USN journal:

    algorithm

    - when an 'added' event occurs
        - if previous event was a 'removed' event on same volume
            - if 'added' event contains recycle bin path, ignore it (file deleted)
            - else if 'removed' event contains recycle bin path, handle as a 'restored'/'undelete' event, remove 'removed' event from queue
            - else 
                - if 'added' event has same filename, handle as a 'moved' event, remove 'removed' event from queue
                - else push 'added' event to queue
        - else push 'added' event to queue
    
    
    - when a 'removed' event occurs, search the queue for an 'added' event for the same filename on a different volume
        - if found, handle it as a 'moved' event and remove 'added' event from queue
        - else push 'removed' event to queue, launch a delayedRemoval thread
    
    
    delayedRemoval thread(&FileActionInfo) {
        // we cannot wait forever, because the 'added' event might never occur (if the file was actually deleted).
        sleep(2000)
        if given 'removed' event is still in the queue
            handle as an actual 'removed' event, and remove it from queue
        return;
    }
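
The pairing rules above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions, not the actual implementation: the names Kind, Outcome, Event and MoveDetector are hypothetical, events are assumed to arrive already reduced to drive letter, bare file name and a recycle-bin flag, and the delayedRemoval timer is left out (the caller is expected to expire pending 'removed' events itself).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
enum class Kind { Added, Removed };
enum class Outcome { Pending, Ignored, Moved, Restored };

struct Event {
    Kind kind;
    wchar_t drive;          // volume letter, e.g. L'C'
    std::wstring fileName;  // bare file name, without directory
    bool inRecycleBin;      // true if the event path lies in the recycle bin
};

class MoveDetector {
public:
    Outcome OnEvent(const Event& e) {
        return e.kind == Kind::Added ? OnAdded(e) : OnRemoved(e);
    }
    std::size_t PendingCount() const { return pending_.size(); }

private:
    // 'added': compare with the most recently queued event.
    Outcome OnAdded(const Event& added) {
        if (!pending_.empty()) {
            const Event prev = pending_.back();  // copy: pop_back() may follow
            if (prev.kind == Kind::Removed && prev.drive == added.drive) {
                if (added.inRecycleBin) return Outcome::Ignored;  // plain delete
                if (prev.inRecycleBin) {                          // undelete
                    pending_.pop_back();
                    return Outcome::Restored;
                }
                if (prev.fileName == added.fileName) {            // same-volume move
                    pending_.pop_back();
                    return Outcome::Moved;
                }
            }
        }
        pending_.push_back(added);
        return Outcome::Pending;
    }

    // 'removed': look for an earlier 'added' with the same name on another volume.
    Outcome OnRemoved(const Event& removed) {
        for (std::size_t i = 0; i < pending_.size(); ++i) {
            const Event& p = pending_[i];
            if (p.kind == Kind::Added && p.drive != removed.drive &&
                p.fileName == removed.fileName) {
                pending_.erase(pending_.begin() + static_cast<std::ptrdiff_t>(i));
                return Outcome::Moved;  // cross-volume move
            }
        }
        pending_.push_back(removed);  // may still turn out to be a real deletion
        return Outcome::Pending;
    }

    std::vector<Event> pending_;
};
```

A 'removed' event that stays Pending is not yet a confirmed deletion; it only becomes one once the waiting period has elapsed without a matching 'added' event.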
    

    exceptions

    • if two files with the same file name are created during the same session on different volumes and one of them is later deleted, this will erroneously be handled as a 'moved' event
    • over time the FileActionInfo queue might grow large, so it should be cleaned once in a while (by setting a maximum delay for moving a file)
      • if the copy takes longer than the allowed delay, we might miss a 'moved' event
      • the bigger the queue, the longer it takes to search it, so quick access is needed (several containers storing the same items with different access modes: vector, hash table)
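
The periodic cleanup mentioned above could look roughly like this. It is a sketch under assumptions: PendingEvent and ExpireOld are hypothetical names, the current time is passed in explicitly rather than read from the clock, and the maximum age is whatever delay you decide a move may take.

```cpp
#include <ctime>
#include <string>
#include <vector>

// Hypothetical record for a queued event that has not been paired yet.
struct PendingEvent {
    std::wstring fileName;
    bool isRemoval;         // true for 'removed', false for 'added'
    std::time_t timestamp;  // when the event was queued
};

// Drops every entry older than maxAgeSeconds from `queue`.
// Expired 'removed' entries are returned so the caller can treat them as
// real deletions; expired 'added' entries are simply discarded.
std::vector<PendingEvent> ExpireOld(std::vector<PendingEvent>& queue,
                                    std::time_t now,
                                    double maxAgeSeconds) {
    std::vector<PendingEvent> expiredRemovals;
    std::vector<PendingEvent> kept;
    for (const PendingEvent& e : queue) {
        if (std::difftime(now, e.timestamp) > maxAgeSeconds) {
            if (e.isRemoval) expiredRemovals.push_back(e);
        } else {
            kept.push_back(e);
        }
    }
    queue.swap(kept);
    return expiredRemovals;
}
```

Calling this from the watcher loop at a fixed interval keeps the queue bounded, at the cost already noted: a copy that takes longer than maxAgeSeconds will be reported as a deletion followed by a creation instead of a move.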

    Just in case it might help, here is the FileActionQueue.h I wrote.
    The most important methods are Last() and Search(LPCWSTR fileName, DWORD action, PCHAR drives).

    #pragma once
    
    
    #include <windows.h>   // LPWSTR, DWORD, GlobalAlloc, GlobalFree
    #include <time.h>
    #include <string>
    #include <cwchar>
    #include <vector>
    #include <map>
    
    using std::wstring;
    using std::vector;
    using std::map;
    
    /* constants defined in winnt.h :
    #define FILE_ACTION_ADDED                   0x00000001   
    #define FILE_ACTION_REMOVED                 0x00000002   
    #define FILE_ACTION_MODIFIED                0x00000003   
    #define FILE_ACTION_RENAMED_OLD_NAME        0x00000004   
    #define FILE_ACTION_RENAMED_NEW_NAME        0x00000005
    */
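    // Custom value, not part of winnt.h. Newer SDKs define FILE_ACTION_ADDED_STREAM
    // with the same value (0x00000006); pick another value if that constant matters to you.
    #define FILE_ACTION_MOVED                    0x00000006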
    
    class FileActionInfo {
    public:
        LPWSTR    fileName;
        CHAR    drive;
        DWORD    action;
        time_t    timestamp;
        
        FileActionInfo(LPCWSTR fileName, CHAR drive, DWORD action) {        
            this->fileName = (WCHAR*) GlobalAlloc(GPTR, sizeof(WCHAR)*(wcslen(fileName)+1));
            wcscpy(this->fileName, fileName);
            this->drive = drive;
            this->action = action;
            this->timestamp = time(NULL);
        }
        
        ~FileActionInfo() {
            GlobalFree(this->fileName);    
        }
    };
    
    /*
    There are two structures storing pointers to FileActionInfo items : a vector and a map. 
    This is because we need to be able to:
    1) quickly retrieve the latest added item
    2) quickly search among all queued items (in which case we use fileName as hashcode)
    */
    class FileActionQueue {
    private:
        vector<FileActionInfo*> *qActionQueue;
        map<wstring, vector<FileActionInfo*>> *mActionMap;
    
        void Queue(vector<FileActionInfo*> *v, FileActionInfo* lpAction) {
            v->push_back(lpAction);
        }
    
        void Dequeue(vector<FileActionInfo*> *v, FileActionInfo* lpAction) {
            for(int i = 0, nCount = v->size(); i < nCount; ++i){
                if(lpAction == v->at(i)) {
                    v->erase(v->begin() + i);
                    break;
                }
            }
        }
    
    public:
    
        FileActionQueue() {
            this->qActionQueue = new vector<FileActionInfo*>;
            this->mActionMap = new map<wstring, vector<FileActionInfo*>>;
        }
        
        ~FileActionQueue() {
            delete qActionQueue;
            delete mActionMap;    
        }
        
        void Add(FileActionInfo* lpAction) {
            this->Queue(&((*this->mActionMap)[lpAction->fileName]), lpAction);
            this->Queue(this->qActionQueue, lpAction);
        }
    
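        void Remove(FileActionInfo* lpAction) {
            // find() avoids creating an empty map entry for an unknown file name;
            // the per-name vector is dropped once it becomes empty.
            map<wstring, vector<FileActionInfo*>>::iterator it = this->mActionMap->find(lpAction->fileName);
            if(it != this->mActionMap->end()) {
                this->Dequeue(&it->second, lpAction);
                if(it->second.empty()) this->mActionMap->erase(it);
            }
            this->Dequeue(this->qActionQueue, lpAction);
        }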
    
        FileActionInfo* Last() {
            vector<FileActionInfo*> *v = this->qActionQueue;
            if(v->size() == 0) return NULL;
            return v->at(v->size()-1);
        }
    
        // Returns the first queued item with the given file name and action whose
        // drive appears in the null-terminated `drives` string, or NULL if none matches.
        FileActionInfo* Search(LPCWSTR fileName, DWORD action, PCHAR drives) {
            // find() does not insert an empty entry, which operator[] would do.
            map<wstring, vector<FileActionInfo*>>::iterator it = this->mActionMap->find(fileName);
            if(it == this->mActionMap->end()) return NULL;
            vector<FileActionInfo*> &v = it->second;
            for(size_t i = 0; i < v.size(); ++i) {
                FileActionInfo* lpAction = v[i];
                if(lpAction->action != action) continue;
                for(int j = 0; drives[j]; ++j) {
                    if(lpAction->drive == drives[j]) return lpAction;
                }
            }
            return NULL;
        }
    };