
Can a separate process be started from within a flask controller by using the multiprocessing library?


Problem:

I have been trying to use the multiprocessing library (Anaconda, Python 3) in a Flask application. The goal is to start a process asynchronously when a button on the web page is clicked. To that end, the button fires a function in a controller:

from multiprocessing import Process

@module.route('/start_long_process', methods=['POST'])
def start_long_process():
    ...
    # Start the process asynchronously
    p = Process(target=some_function)
    p.start()
    ...

If I then check whether the process is alive:

    print(p.pid)
    print(p.is_alive())

a pid is printed and is_alive() returns True.

However, the function never actually seems to start: no print statement runs and nothing is written to file. Yet when I run the same function in a multiprocessing Process outside the Flask application, it does exactly what is expected.

I know that something like Celery is usually preferred in combination with Flask for background tasks, but that would be a lot of overhead here.

Question

Is it not possible to start a separate process from within a flask controller function?


Solution

  • If you are using the Flask development server, you can structure your app like this:

    from multiprocessing import Pool
    from flask import Flask
    
    app = Flask(__name__)
    _pool = None
    
    
    def wanted_function(x):
        # import packages that are used only in this function
        # do your expensive, time-consuming work here
        return x * x
    
    
    @app.route('/expensive_calc/<int:x>')
    def route_expcalc(x):
        f = _pool.apply_async(wanted_function, (x,))
        r = f.get(timeout=2)
    
        return 'Result is %d' % r
    
    
    if __name__ == '__main__':
        _pool = Pool(processes=4)
        try:
            # insert production server deployment code
            app.run()
        except KeyboardInterrupt:
            _pool.close()
            _pool.join()
    

    Then run it with ./<script_name>.py and you will have both the Flask app listening and _pool available for worker processes.

    However, if you plan to set up a more serious deployment, you will probably want a web server with WSGI support.

    In that case your Flask app is not started as in the development-server case, so you won't have access to _pool, and you should consider a proper task queue such as Celery. There are curated lists of good task queues, some of which are easier to set up and use than Celery.
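    The task-queue idea the answer points at can also be sketched with the standard library alone. The following is a minimal illustration (not from the original answer; all names are made up): one long-lived worker process consumes jobs from a multiprocessing.Queue, so the process that enqueues work never blocks on the computation itself.

    ```python
    # Stdlib-only sketch of a tiny task queue: a single worker
    # process squares numbers pulled from a shared queue.
    from multiprocessing import Process, Queue


    def worker(jobs, results):
        # consume jobs until the sentinel None arrives
        for x in iter(jobs.get, None):
            results.put(x * x)


    def run_jobs(values):
        jobs, results = Queue(), Queue()
        p = Process(target=worker, args=(jobs, results))
        p.start()
        for v in values:
            jobs.put(v)
        jobs.put(None)          # sentinel: tell the worker to exit
        p.join()
        return sorted(results.get() for _ in values)


    if __name__ == '__main__':
        print(run_jobs([1, 2, 3]))  # [1, 4, 9]
    ```

    A real queue like Celery adds persistence, retries, and multiple workers on top of this same producer/consumer shape.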

    Windows

    If you are trying to do this on Windows, you will run into difficulties.

    The way a child process is started on Windows differs. You can check the documentation for more info, but this is the important part:

    spawn

    The parent process starts a fresh python interpreter process. The child process will only inherit those resources necessary to run the process object's run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver.

    Available on Unix and Windows. The default on Windows.

    fork

    The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.

    Available on Unix only. The default on Unix.

    forkserver

    When the program starts and selects the forkserver start method, a server process is started. From then on, whenever a new process is needed, the parent process connects to the server and requests that it fork a new process. The fork server process is single threaded so it is safe for it to use os.fork(). No unnecessary resources are inherited.

    Available on Unix platforms which support passing file descriptors over Unix pipes.
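    The practical consequence of "spawn" is that the child re-imports your module in a fresh interpreter, so any unguarded top-level code (such as creating the pool or starting the server) would run again in every child. A minimal sketch, assuming nothing beyond the standard library (forcing "spawn" on Unix just reproduces the Windows default for testing):

    ```python
    # Demonstrates the __main__ guard required by the "spawn" start
    # method: the child re-imports this module, so process creation
    # must not happen at top level.
    import multiprocessing as mp


    def square(x):
        # runs in a fresh interpreter process under "spawn"
        return x * x


    if __name__ == '__main__':
        # "spawn" is the default on Windows; force it here so the
        # same behaviour can be observed on Unix.
        mp.set_start_method('spawn', force=True)
        with mp.Pool(2) as pool:
            print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
    ```

    This is also why the answer's example creates _pool inside the if __name__ == '__main__' block rather than at import time.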