I have written a python code which convert raw data (STM Microscope) into png format and it run perfectly on my Macbook Pro.
Below is the simplified Python Code:
for root, dirs, file in os.walk(path):
for dir in dirs:
fpath = path +'/'+ dir
os.chdir(fpath)
spaths=savepath +'/'+ dir
if os.path.exists(spaths) ==False:
os.mkdir(spaths)
for files in glob.glob("*.sm4"):
for file in files:
data_conv (files, file, spaths)
But it does take 30 - 40 mins for100 files.
Now, I wanted to reduce processing time using multithreading technique (using “concurrent future” library). Was trying to modify python code using YouTube video on “Python Threading Tutorial” as an example.
But I have to pass too many arguments such as “root”, “dirs.”, “file” in the executor.map() method. I don’t know how to resolve this further.
Below this the simplified multithreading Python code
def raw_data (root, dirs, file):
for dir in dirs:
fpath = path +'/'+ dir
os.chdir(fpath)
spaths=savepath +'/'+ dir
if os.path.exists(spaths)==False:
os.mkdir(spaths)
for files in glob.glob("*.sm4"):
for file in files:
data_conv(files, file, spaths)
with concurrent.futures.ThreadPoolExecutor() as executor:
executor.map(raw_data, root, dirs, file)
NameError: name 'root' is not defined
Any suggestion is appreciated, Thank You.
Thanks for the advice Iain Shelvington & Thenoneman.
Pathlib does reduces the clutter I was having in my code.
"ProcessPoolExecutor" worked in my CPU intense function.
with concurrent.futures.ProcessPoolExecutor() as executor:
executor.map(raw_data, os.walk(path))