python flask werkzeug multiple-processes

Werkzeug and class state with Flask: How are class member variables resetting when the class isn't being reinitialized?


I'm trying to write a Flask extension that needs to persist some information between requests. This works fine when I run Werkzeug with a single process, but when I run it with multiple processes I get some odd behavior that I don't understand. Take this simple application as an example:

from flask import Flask
app = Flask(__name__)

class Counter(object):
    def __init__(self, app):
        print('initializing a Counter object')
        self.app = app
        self.value = 0

    def increment(self):
        self.value += 1
        print('Just incremented, current value is ', self.value)

counter = Counter(app)

@app.route('/')
def index():
    for i in range(4):
        counter.increment()
    return 'index'

if __name__ == '__main__':
    #scenario 1 - single process
    #app.run()
    #scenario 2 - threaded
    #app.run(threaded=True)
    #scenario 3 - two processes
    app.run(processes=2)

For the first two scenarios it behaves exactly as I would expect: the Counter object is initialized once and then it increments with every request to the '/' route. When I run it with the third scenario (passing processes=2) then I get this as output:

 initializing a Counter object
  * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 Just incremented, current value is  1
 Just incremented, current value is  2
 Just incremented, current value is  3
 Just incremented, current value is  4
 127.0.0.1 - - [30/Aug/2015 09:47:25] "GET / HTTP/1.1" 200 -
 Just incremented, current value is  1
 Just incremented, current value is  2
 Just incremented, current value is  3
 Just incremented, current value is  4
 127.0.0.1 - - [30/Aug/2015 09:47:26] "GET / HTTP/1.1" 200 -
 Just incremented, current value is  1
 Just incremented, current value is  2
 Just incremented, current value is  3
 Just incremented, current value is  4
 127.0.0.1 - - [30/Aug/2015 09:47:27] "GET / HTTP/1.1" 200 -

It seems that counter.value is returning to its state right after being initialized without actually being re-initialized. Could somebody shed some light on what Werkzeug is doing internally to make this happen? I'd also be very interested in learning if there is a way to make this behave as I would naively expect (two processes, each with their own instance of Counter). Thanks!


Solution

  • The first example (single thread) just uses the one Counter, so it works.

    In the second example (multiple threads), a thread is spawned to handle each request. All threads share memory with the one Counter that was created before they were spawned, so an increment from any thread increments the same object.

    In the last example (multiple processes), a process is spawned to handle each request. Flask's dev server uses fork: each child starts from a copy of the parent's memory (where counter is already initialized, so __init__ never runs again), but it increments the counter in its own address space, which goes away when the request ends.

    import os
    
    class Counter:
        def __init__(self):
            print('init')
            self.value = 0
    
        def increment(self):
            self.value += 1
            print('inc -> {}'.format(self.value))
    
    counter = Counter()
    
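    def multi():
        pid = os.fork()
        if pid == 0:
            # child starts with a copy of the parent's memory
            for _ in range(3):
                # increments three times
                counter.increment()
    
            # child is done
            os._exit(0)
    
        # parent waits for the child so the output order is deterministic
        os.waitpid(pid, 0)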
    
    # three processes run
    for _ in range(3):
        multi()
    
    init
    inc -> 1
    inc -> 2
    inc -> 3
    inc -> 1
    inc -> 2
    inc -> 3
    inc -> 1
    inc -> 2
    inc -> 3
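
    For contrast, here is a minimal sketch of the threaded case, where every handler works on the same object. The handle_request and run_requests helpers are hypothetical stand-ins for the index() view and the server's thread spawning, and the lock is an addition of this sketch (self.value += 1 is not guaranteed to be atomic across threads):

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        # Assumption of this sketch: guard the increment with a lock,
        # because "self.value += 1" is a read-modify-write sequence.
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1


counter = Counter()


def handle_request():
    # Hypothetical stand-in for the index() view: four increments per request.
    for _ in range(4):
        counter.increment()


def run_requests(n):
    # Spawn one thread per simulated request, as a threaded server would.
    threads = [threading.Thread(target=handle_request) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


print(run_requests(3))  # 12: all threads incremented the same Counter
```

    Unlike the forked children, the threads never get a private copy of counter, so the total keeps growing across requests.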
    

    Use a database or other external storage to keep global state across processes, for example by loading it in a before_request handler and saving it in an after_request handler. Note that this is not entirely straightforward: the read, increment, and save steps of each request have to happen atomically, so that two concurrent requests (threads or processes) don't overwrite each other's updates. Otherwise an update can be lost:

    req 1 starts, gets stored value = 4
    req 2 starts, gets stored value = 4
    req 1 increments, value = 8
    req 1 saves, value = 8
    req 2 increments, value = 8
    req 2 saves, value = 8 but should = 12
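
    One way to avoid that lost update is to let the storage layer do the increment itself instead of reading and writing the value back. The following is only a sketch under assumptions: it uses SQLite from the standard library as the external storage, and the counter table, the init_counter and increment functions, and the temporary db_path are all hypothetical names chosen for illustration.

```python
import os
import sqlite3
import tempfile


def init_counter(db_path):
    # Create the (hypothetical) single-row counter table if it is missing.
    conn = sqlite3.connect(db_path)
    try:
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS counter ("
                "id INTEGER PRIMARY KEY CHECK (id = 1), "
                "value INTEGER NOT NULL)"
            )
            conn.execute("INSERT OR IGNORE INTO counter (id, value) VALUES (1, 0)")
    finally:
        conn.close()


def increment(db_path, amount=1):
    # The UPDATE adds to whatever value is stored right now, inside one
    # transaction, so a concurrent request cannot overwrite it with a
    # stale value that it read earlier.
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE counter SET value = value + ? WHERE id = 1", (amount,)
            )
            (value,) = conn.execute(
                "SELECT value FROM counter WHERE id = 1"
            ).fetchone()
        return value
    finally:
        conn.close()


# Assumed location for the demo; a real app would use a fixed path.
db_path = os.path.join(tempfile.mkdtemp(), "counter.db")
init_counter(db_path)

total = 0
for _ in range(3):  # three simulated requests, four increments each
    total = increment(db_path, 4)
print(total)  # 12
```

    Because the value lives in a file rather than in process memory, each forked worker that opens the same path sees the increments made by the others.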