python multiprocessing pool

Control-C Handling with Multiprocessing, Pool, and reading files


I am just getting started with multiprocessing in Python, and I need some help catching Control-C in my program. The script I am writing reads in a file and then performs some tasks on each line. Before anyone comments on I/O and the advantages/disadvantages of multiprocessing: I am aware :) These tasks lend themselves well to multithreading.

I have the following code, and from the documentation I would expect it to work; however, it is not catching my keyboard interrupt! ARRGH... Please help.

Running on Win10 if that makes any difference:

from multiprocessing import cpu_count
from multiprocessing.dummy import Pool as ThreadPool
import argparse
from time import sleep
import signal
import sys


def readfile(filename):
  # The with block closes the file automatically, so no explicit close() is needed
  with open(filename, 'r') as f:
    data = f.readlines()
  return data

def work(line):
  while(True):
    try:
      print(f"\rgoing to do some work on {line}")
      countdown(5)
    except (KeyboardInterrupt, SystemExit):
      print("Exiting...")
      break

def countdown(time=30):
  sleep(time)

def parseArgs(args):
  if args.verbose:
    verbose = True
    print("[+] Verbosity turned on")
  else:
    verbose = False
  if args.threads:
    threads = args.threads
  else:
    threads = cpu_count()
  print(f'[+] Using {threads} threads')
  return threads, verbose, args.file


if __name__ == '__main__':
  parser = argparse.ArgumentParser()
  parser.add_argument("-f", "--file", required=True, help="Insert the file you plan on parsing")
  # type=int so the value can be passed straight to ThreadPool
  parser.add_argument("-t", "--threads", type=int, help="Number of threads, by default will use all available processors")
  parser.add_argument("-v", "--verbose", help="increase output verbosity",
                       action="store_true")
  threads, verbose, filename = parseArgs(parser.parse_args())
  #read the entire file and store it in a variable:
  data = readfile(filename)
  #Init the data pool
  pool = ThreadPool(threads) # Number of threads going to use
  try:
    pool.map(work,data) # This launches the workers at the function to do work
  except KeyboardInterrupt:
    print("Exiting...")
  finally:
    pool.close()
    pool.join()

Solution

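  • When you press Control-C, the program is probably blocked waiting on the pool. Note that pool.map does block until all results are ready (the non-blocking variant is pool.map_async), and because work() loops forever, that wait never finishes on its own. KeyboardInterrupt is only raised in the main thread, so the except clause inside work() never runs in the worker threads. On Windows, an untimed wait like the one inside pool.map is generally not interrupted by Control-C, so the exception may never reach your try block. And if it is caught, the pool.join() in the finally block then waits forever for workers that never return.

    I am not too sure about the best practices here, but as a first step I would move close() and join() inside the try block, so that they are skipped when the interrupt arrives: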

    try: 
        pool.map(work, data) # This launches the workers at the function to do work 
        pool.close() 
        pool.join()
    except KeyboardInterrupt: 
        print("Exiting...")
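
If that is still not enough, one common pattern is to start the work with map_async and poll the result with a short timeout, so the main thread regularly gets a chance to raise KeyboardInterrupt, and to give the workers a shared flag they check between steps. The following is only a minimal sketch under those assumptions, not a drop-in replacement for your script: stop_requested, run, poll_interval, and the five short steps inside work are all made-up names and stand-ins for your real per-line task.

```python
from multiprocessing.dummy import Pool as ThreadPool
from threading import Event

# Hypothetical shared flag: the main thread sets it to ask workers to stop.
stop_requested = Event()


def work(line):
    """Process one line in small steps, checking the stop flag between steps."""
    steps_done = 0
    for _ in range(5):
        # Stand-in for real work; wait() returns True as soon as the flag is set.
        if stop_requested.wait(0.01):
            break
        steps_done += 1
    return line.strip(), steps_done


def run(lines, threads=4, poll_interval=0.2):
    """Run work() over lines; return the results, or None if interrupted."""
    pool = ThreadPool(threads)
    try:
        async_result = pool.map_async(work, lines)  # returns immediately
        # Poll with a timeout instead of blocking indefinitely, so that a
        # pending Control-C can be turned into KeyboardInterrupt here.
        while not async_result.ready():
            async_result.wait(poll_interval)
        return async_result.get()
    except KeyboardInterrupt:
        print("Exiting...")
        stop_requested.set()  # ask the workers to finish early
        return None
    finally:
        pool.close()
        pool.join()  # returns because workers exit once the flag is set


if __name__ == '__main__':
    print(run(["first\n", "second\n", "third\n"]))
```

The key design choice is that the workers cooperate: threads cannot be killed from outside, so the loop in work() has to check the flag itself rather than rely on catching KeyboardInterrupt, which only ever arrives in the main thread.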