Search code examples
pythonmultithreadinguser-interfaceconcurrencyrace-condition

How to prevent race conditions between the UI thread and a different one who both access the same list without blocking the UI thread?


I'm currently working on a GUI toolkit from the ground up in Pygame just for the learning experience, and I'm running into a problem while working on the directory-tree widget. Let me explain the multi-threading system I'm using.

All node objects are stored in a flat list. When a node is expanded, the UI thread adds a node which displays Loading... as a child-node to the expanded node, and it starts a background thread which will load the directory structure that the expanded node represents, and push the result into a queue that the UI thread and loader thread share. There's no limit to how many "loader threads" can run at the same time, as they will all dump the loaded data into a tuple in the same queue. The UI thread will empty the queue every frame and create all child-nodes needed to represent the newly loaded directories under the correct parent nodes. And if a node is un-expanded, the UI thread simply deletes all nodes after the index of the un-expanded node which have a nested level greater than it. Also, when a specific node is expanded for the first time, a cache file which stores the sub-directory data of the directory that the node represents is created. If the exact same node is expanded again in the future, directory data will be read from the cache instead of from the filesystem to optimize the loading speed.

Everything works up to here, but then I realized that with the current design, the treeview only checks the filesystem the first time each node is expanded. So once a node is expanded, changes to the filesystem would not be reflected even if the node is re-expanded (since it would read from the cache). So I wrote a 'refresh' function which simply deletes the entire cache directory using shutil.rmtree. That way, when a node is expanded/re-expanded after the refresh, it will read from the filesystem and re-create the cache for that directory. But the problem is, removing the cache directory won't affect nodes that are already expanded. The changes for those won't be reflected until they are re-expanded. Therefore, I wrote an algorithm that checks through the entire node-list and updates it to match the filesystem (it will detech directories that exist in the filesystem but not the treeview and add them, and will also remove nodes that no longer exist in the filesystem). And of course, to prevent blocking the UI thread, the program spawns a new thread when the 'refresh' button is clicked. And here's where the problem began: what do I do if the user collapses a node while the 'refresh thread' is running? I've already added code to stop the user from expanding nodes by making newly spawned 'loading threads' detect if the 'refresh thread' is running, and suspend itself if it is (which would make the expanded node display Loading... below it until the 'loading thread' in charge of it is allowed to proceed and finishes loading), but what do I do about collapsing? It wouldn't make sense to make collapsing a node say 'loading'. But if I don't stop the user from collapsing nodes, the main thread would modify the node-list while the 'refresh thread' is actively reading and writing to it. This would both mess up the algorithm in the 'refresh thread' and (probably) corrupt data in the node-list.

After some thinking, I thought of three faint ideas, but they all aren't very good... My first idea is to use a 'threading.Lock' object in both the main thread and the 'refresh thread', and (in both threads) lock it every time right before making a change to the list, and unlock it right after the statement which modifies the list. But although this would make sure the data in the list won't be left corrupted due to being written to by two threads at the same time, it won't stop the main thread from changing the length and content of the list while the algorithm in the 'refresh thread' is working (which would mess it up). I could also make the thread lock the lock at the beginning and keep it locked until just before the thread ends. But then that would mean if the user collapses a node after clicking refresh, the UI thread would freeze until the 'refresh thread' is done...

My other idea is to modify the 'refresh thread' so that it makes a copy of the node-list when it starts, and modifies that instead of the real node-list. Then when the thread is done, I'll assign that copy of the list to self.nodes (the variable used to store the list), which would replace it. But the problem with this plan is that this would mean all changes made to the list by the user during the time the 'refresh thread' is running will be overwritten.

My third idea is to modify the 'refresh thread' so that instead of actually modifying the node-list, it adds multiple items to a queue to tell the main thread which indexes in the node-list to remove, and what items to add at which index. But this probably would be really hard to implement, since every time the user collapses a node, the indexes of the same items would change...

I'm simply out of ideas on how to proceed at this point, unless I really stop the user from collapsing nodes while the 'refresh thread' is running. So I would really appreciate a suggestion on what to do.


Solution

  • I've given up on making the dir-tree widget and thought of a way I could continue my project without finishing it. The reason I had to make that dir-tree widget is because I wanted my own open-file-dialog that I could use in my Pygame application. But due to being unable to finish it, I'm instead just going to start a new subprocess and use the file-dialogs provided by tkinter.filedialog in that process. Since the game and the tkinter file-dialog are in two separate processes, they won't interfer with each other. And when the user finishes selecting a file, the subprocess will print the selected path to STDOUT and exit. Then the main process can read the STDOUT of the finished process and obtain the selected path.