python multithreading tkinter multiprocessing progress-bar

Threading and Multiprocessing within Tkinter


I am building an app that can batch-process a group of large STL files: load them, check them for issues, and repair any holes, non-manifold geometry, etc. to prepare them for 3D printing.

I am trying to implement multiprocessing of the repair step as it can take a long time for larger files and I have a PC with 64 cores that can make short work of a long list of parts. While it is doing this, I don't want to lock the GUI, and I want to be able to update a progress bar and label to show the user how it is running.

I am using customtkinter to build the GUI and have everything within a class called App. Since I read that the tkinter GUI has to stay within a single thread, I used the threading module to create a secondary thread to run the repair step. That thread then spawns one process per part. As each process works on its part, it updates the Part object and the main class to let it know the part has been repaired.

While the above thread is running, I have a while loop in the main thread that continually checks how many parts have been repaired and updates the progress bar and label.

Below is a simplified version of the application. I removed some downstream processes for clarity; it returns the same error as the full version.

This is the error I get. It seems to happen in the repair_pool function, when I try to start the processes in the secondary thread. It seems to be trying to pickle the tkinter GUI, but as far as I can see, the GUI is not involved in the secondary processes. What am I doing wrong? Is it because the main class App inherits from the customtkinter class?

Exception in thread Thread-1:
Traceback (most recent call last):
  File "\threading.py", line 973, in _bootstrap_inner
    self.run()
  File "\threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "\app.py", line 164, in repair_pool
    process.start()
  File "\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle '_tkinter.tkapp' object
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "\multiprocessing\spawn.py", line 107, in spawn_main
    new_handle = reduction.duplicate(pipe_handle,
  File "\multiprocessing\reduction.py", line 79, in duplicate
    return _winapi.DuplicateHandle(
OSError: [WinError 6] The handle is invalid

The code I am using...

# from standard library
from tkinter import messagebox
from tkinter import filedialog
import shutil
import os
import time
from multiprocessing import Process
import threading

# external modules

import customtkinter
import numpy as np
import pymeshfix
from stl import mesh


class App(customtkinter.CTk):
    def __init__(self):
        # build the GUI
        super().__init__()
        self.importFolder = ""   # folder where batch of in files are
        self.partsLoaded = False   # will not let us run the repair step until parts are loaded
        self.partsRepaired = False   # will not let us proceed until parts are repaired
        self.parts = []   # list to store part objects
        self.total_parts = 0    # count of how many parts have been added
        self.repaired_parts = 0   # count of how many parts have been repaired

        self.title("3D Part Repair")
        self.geometry("500x500")   # main window
        self.frame = customtkinter.CTkFrame(self)
        self.frame.grid(row=4, column=1)

        # button to import parts
        self.importButton = customtkinter.CTkButton(self,
                                                    text="Import Parts",
                                                    command=self.add_parts
                                                    )
        self.importButton.grid(row=1,
                               column=1,
                               pady=20
                               )

        # label to show how many parts are in the folder chosen with the importButton
        self.partsLabel = customtkinter.CTkLabel(self, text="No Parts Loaded")
        self.partsLabel.grid(row=2,
                             column=1,
                             pady=20
                             )

        # progressbar to update as parts are repaired
        self.partsProgress = customtkinter.CTkProgressBar(self, orientation='horizontal')
        self.partsProgress.set(0)
        self.partsProgress.grid(row=3,
                                column=1,
                                pady=20
                                )

        # when pressed, creates a new thread to track the multiprocessing of large mesh files
        self.repairButton = customtkinter.CTkButton(self,
                                                    text="Check/Repair Parts",
                                                    command=self.run_repair
                                                    )
        self.repairButton.grid(row=4,
                               column=1,
                               pady=20
                               )


    def add_parts(self):
        # first check whether parts are already loaded, as we might not want to replace them
        if self.partsLoaded:
            # ask the user whether they want to proceed
            answer = messagebox.askyesno("Alert", "You already have parts in the queue.\n"
                                                  "By proceeding, you will delete all parts.\n"
                                                  "Would you like to proceed?")

            # if they say yes (True), then clear all parts out
            if answer:
                self.partsRepaired = False
                self.partsLoaded = False
                # also clear out the packer of any jobs, and parts/items
                self.parts.clear()
                self.total_parts = 0
                self.repaired_parts = 0

        # otherwise, we currently do not have any parts loaded
        else:
            # prompts user to pick an import folder
            self.importFolder = filedialog.askdirectory(initialdir='/', title="Select a Folder")

            # if the user presses cancel, the result is empty and this check catches it
            if self.importFolder != '':
                # loop through files in input directory
                for file in os.listdir(self.importFolder):
                    # check if the result is a file and not a subdirectory
                    if os.path.isfile(os.path.join(self.importFolder, file)):
                        fileExt = file.split('.')[1]
                        part = Part(file)
                        self.parts.append(part)

            # update label
            if len(self.parts) > 0:
                self.partsLoaded = True
                self.total_parts = len(self.parts)   # needed later to compute progress
                self.partsLabel.configure(text=f"{len(self.parts)} total parts added.")
            else:
                messagebox.showerror("Alert", "No parts in folder!")

    def run_repair(self):
        """
        repairs all parts added to the app; spawns a secondary thread to run the multiprocessing

        """

        # first check if the script has already been run
        if not self.partsLoaded:
            messagebox.showerror("Alert", "No Parts loaded!")
            return

        if self.partsRepaired:
            answer = messagebox.askyesno(title="Alert", message="The repair script has already been run.\n"
                                                                "Run again?")
            if not answer:
                return

        # makes a temporary cache folder to save repaired parts to
        if not os.path.isdir('cache'):
            os.mkdir("cache")
        else:
            # remove first to delete any possible parts
            shutil.rmtree("cache")
            os.mkdir("cache")

        # create thread to run the multiprocessor
        repair_thread = threading.Thread(target=self.repair_pool)
        repair_thread.start()
        repair_thread.join()

        # start a timer and a timeout so we can break out of the below loop if it runs too long
        start = time.time()
        timeout = 300

        # the loop below checks the self.repaired_parts variable every second
        # as it is updated via the above thread, it will change the value in the progress bar
        # and label
        while self.repaired_parts < self.total_parts:
            time.sleep(1)
            self.partsProgress.set(self.repaired_parts/self.total_parts)
            self.partsLabel.configure(text=f"{self.repaired_parts} repaired out of {self.total_parts}")
            if time.time() - start > timeout:
                print("Timed out of repair script")
                break

        # after all parts repaired
        self.partsRepaired = True
        self.repairedFiles = [name for name in os.listdir('cache') if os.path.isfile(os.path.join('cache', name))]
        self.partsLabel.configure(text="All parts repaired!")

    def repair_pool(self):
        # this function is called in a secondary thread to multiprocess the repair of each part
        processes = [Process(target=self.mesh_repair, args=(part,)) for part in self.parts]
        # start the processes
        for process in processes:
            process.start()
        # join them together
        for process in processes:
            process.join()

    def mesh_repair(self, part):
        # reads mesh data, fills in small holes, calculates the bounding box of the part
        # also reads the size and current position for down stream processes so we don't have to open and read the file multiple times
        meshFileIn = os.path.join(self.importFolder, part.file)
        print(f"Working on part: {part.file}")
        tin = pymeshfix.PyTMesh()
        tin.load_file(meshFileIn)

        # Fill holes
        tin.fill_small_boundaries()

        # return numpy arrays
        vclean, fclean = tin.return_arrays()

        # determine part bounding box
        bboxMin = np.min(vclean, axis=0)
        bboxMax = np.max(vclean, axis=0)

        # calculate the width, height and depth (x,y,z)
        part.width = bboxMax[0] - bboxMin[0]
        part.height = bboxMax[1] - bboxMin[1]
        part.depth = bboxMax[2] - bboxMin[2]

        # if the part is not placed at (x, y, z) = (0, 0, 0), add offsets to prevent bad placement
        part.position_offset = [
            0 - bboxMin[0],
            0 - bboxMin[1],
            0 - bboxMin[2]
        ]

        # create the repaired file
        outFile = mesh.Mesh(np.zeros(fclean.shape[0], dtype=mesh.Mesh.dtype))
        for i, f in enumerate(fclean):
            for j in range(3):
                outFile.vectors[i][j] = vclean[f[j], :]

        # create the name and export
        outFileName = part.file.split('.')[0] + '_repaired.' + part.file.split('.')[1]
        outFile.save(os.path.join('cache', outFileName))

        self.repaired_parts += 1

        print(f"Finished part: {part.file}")


class Part:
    """
    Part object used to store data about the part
    """
    def __init__(self, file):

        self.file = file
        self.type = 0
        self.width = 0
        self.height = 0
        self.depth = 0
        self.rotation_type = 0
        self.position_offset = [0, 0, 0]
        self.position = [0, 0, 0]


    def get_volume(self):
        # print(self.name)
        return self.width * self.height * self.depth


if __name__ == "__main__":
    app = App()
    app.mainloop()

Solution
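
Some background on why the GUI gets pickled at all: on Windows, multiprocessing starts children with the spawn method, so the Process object and its target are pickled and sent to the child. A bound method such as self.mesh_repair carries a reference to its instance, so pickling it means pickling the whole App, tkinter handle included. Below is a minimal sketch of that effect that runs without tkinter. FakeApp, can_pickle and the module-level mesh_repair are hypothetical names for illustration, and a threading.Lock stands in for the unpicklable '_tkinter.tkapp' object.

```python
import pickle
import threading


class FakeApp:
    """Stand-in for the tkinter App: holds one attribute that cannot be pickled."""

    def __init__(self):
        self.import_folder = "parts"
        # Assumption: a lock plays the role of the '_tkinter.tkapp' object here.
        self.window = threading.Lock()

    def mesh_repair(self, part):
        # Bound method: pickling it also pickles the instance it belongs to.
        return part


def mesh_repair(args):
    """Module-level version: it depends only on the arguments it is given."""
    import_folder, part = args
    return part


def can_pickle(obj):
    """Return True if pickle can serialise obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    app = FakeApp()
    print("bound method picklable:", can_pickle(app.mesh_repair))
    print("top-level function picklable:", can_pickle(mesh_repair))
    print("plain arguments picklable:", can_pickle((app.import_folder, "part.stl")))
```

The bound method fails for the same reason as in the traceback: the instance behind it holds something pickle cannot handle. The top-level function and its plain arguments go through, which is the idea behind the restructuring below.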

  • Here's an example of how to restructure this snippet so that the target function lives outside the class that contains unpicklable tkinter objects. (I don't have all your libraries, so I cannot test this fully; please focus on the comments and the structural change.)

    from multiprocessing import Pool #pool has lots of great functionality already written for you like returning results from a child process
    
    ...
    
    class App:
        
        ...
        
        def repair_pool(self):
            # this function is called in a secondary thread to multiprocess the repair of each part
            
            #build a list of args for pool.imap_unordered
            arglist = [(partindex, self.importFolder, part) for partindex, part in enumerate(self.parts)] #enumerate so we can index into parts and replace old parts with new parts after computation
            # start the process pool with default number of processes (same as cpu cores)
            with Pool() as pool:
                for partindex, part in pool.imap_unordered(mesh_repair, arglist):
                    self.parts[partindex] = part #if you don't care about the order, you could skip all the partindex stuff, and simply delete and re-create self.parts here
                    self.repaired_parts += 1 #updating this here instead of using some sort of inter-process shared value.
                pool.close() #close and join are not strictly needed here, but you would need them if you are using any of the async functions, so it's good practice to remember.
                pool.join()
    
    #mesh_repair needs to be a function you can import directly (top level definition), or if it is a class method, the class instance must be fully picklable.
    def mesh_repair(args): #needs a single arg unless we use starmap (and we want imap_unordered so we can get progress updates as they complete)
        #the only thing we need from self is the import folder anyway
        partindex, importFolder, part = args #unpack args
        
        # reads mesh data, fills in small holes, calculates the bounding box of the part
        # also reads the size and current position for down stream processes so we don't have to open and read the file multiple times
        meshFileIn = os.path.join(importFolder, part.file)
        print(f"Working on part: {part.file}")
        
        ...
        #nothing changed in-between here
        ...
    
        #self.repaired_parts += 1 #this wouldn't work anyway because it is not a sharedctypes.Value or any other type of shared object
                                  #we will instead increment this counter in the main process as pool.imap_unordered returns results
    
        print(f"Finished part: {part.file}")
        return partindex, part #we need to return the index and the new part (it was copied when it was sent to the child process)
                               #so we can insert it back into the correct place in the App.parts list. (they may come back out of order)
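
One more thing to watch for in run_repair from the question: calling repair_thread.join() right after start(), and then looping with time.sleep(1), both run in the main thread, so the tkinter event loop cannot redraw the progress bar until they return. A common alternative is to start the thread and let the widget's after() method schedule a short polling callback. The sketch below is untested against customtkinter; ProgressPoller and its parameter names are hypothetical, and it only assumes the widget has the standard tkinter after(ms, callback) method.

```python
import threading


class ProgressPoller:
    """Runs a worker in a background thread and polls its progress via after()."""

    def __init__(self, widget, get_done, total, on_update, on_finish, poll_ms=200):
        self.widget = widget        # any tkinter widget; only .after() is used
        self.get_done = get_done    # callable returning how many items are done
        self.total = total          # total number of items
        self.on_update = on_update  # called in the GUI thread with (done, total)
        self.on_finish = on_finish  # called in the GUI thread once the worker ends
        self.poll_ms = poll_ms
        self.thread = None

    def start(self, target):
        # Start the worker and return immediately so the event loop keeps running.
        self.thread = threading.Thread(target=target, daemon=True)
        self.thread.start()
        self.widget.after(self.poll_ms, self._poll)

    def _poll(self):
        # Check liveness first so the last update reads the final count.
        alive = self.thread.is_alive()
        self.on_update(self.get_done(), self.total)
        if alive:
            self.widget.after(self.poll_ms, self._poll)
        else:
            self.on_finish()
```

Inside App, this could be wired up as ProgressPoller(self, lambda: self.repaired_parts, len(self.parts), update_widgets, mark_done).start(self.repair_pool), where update_widgets and mark_done are hypothetical methods that set the progress bar and label. All widget updates then happen in the main thread, and repair_pool only increments the counter.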