Tags: python, closures, state

How to maintain state in Python without classes?


Are there Pythonic ways to maintain state (for purposes of optimisation, for example) without going fully object-oriented?

To illustrate my question better, here's an example of a pattern I use frequently in JavaScript:

var someFunc = (function () {
    var foo = some_expensive_initialization_operation();
    return function (bar) {
        // do something with foo and bar
    }
}());

Externally this is just a function like any other, with no need to initialise objects or anything like that, but the closure allows computing values a single time that I then essentially use as constants.

An example of this in Python is optimising regular expressions: it's useful to call re.compile once and store the compiled version for later match and search operations.

The only ways I know of to do this in Python are by setting a variable in the module scope:

compiled_regex = compile_my_regex()

def try_match(m): # In reality I wouldn't wrap it as pointlessly as this
    return compiled_regex.match(m)

Or by creating a class:

class MatcherContainer(object):
    def __init__(self):
        self.compiled_regex = compile_my_regex()
    def try_match(self, m):
        return self.compiled_regex.match(m)

my_matcher = MatcherContainer()

The former approach is ad hoc, and it's not very clear that the function and the variable declared above it are associated with each other. It also pollutes the module's namespace a bit, which I'm not too happy with.

The latter approach seems verbose and a bit heavy on the boilerplate.

The only other way I can think of to deal with this is to factor any functions like this out into separate files (modules) and just import the functions, so that everything's clean.

Any advice from more experienced Pythoners on how to deal with this? Or do you just not worry about it and get on with solving the problem?


Solution

  • You can also accomplish this with default arguments:

    import re

    def try_match(m, re_match=re.compile(r'sldkjlsdjf').match):
        return re_match(m)
    

    since default arguments are evaluated only once, when the def statement runs (for a module-level function, that is at import time).

    Or even simpler:

    try_match = lambda m, re_match=re.compile(r'sldkjlsdjf').match: re_match(m)
    

    Or simplest yet:

    try_match = re.compile(r'sldkjlsdjf').match
    

    This saves not only the regex compile time (which the re module caches internally anyway), but also the attribute lookup of the .match method. In a busy function or a tight loop, those '.' lookups can add up.
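
For comparison, the JavaScript closure pattern from the question can also be written fairly directly in Python with a nested function. The following is only a minimal sketch: make_matcher and setup_calls are hypothetical names introduced here for illustration, and the pattern r"\d+" is an arbitrary stand-in for whatever expensive initialisation you actually need.

```python
import re

# Hypothetical bookkeeping list, used only to show how often setup runs.
setup_calls = []


def make_matcher(pattern):
    """Run the expensive setup once and return a function that reuses it."""
    setup_calls.append(pattern)      # record that setup happened
    compiled = re.compile(pattern)   # stands in for the expensive initialisation

    def try_match(text):
        # `compiled` is captured by the closure; it is not recomputed per call.
        return compiled.match(text)

    return try_match


# Externally this is just a function; the compiled pattern stays private.
try_match = make_matcher(r"\d+")

first = try_match("123abc")   # a match object
second = try_match("abc")     # None, since the text does not start with digits
```

Compared with the default-argument trick, this keeps the cached value out of the function's signature, so a caller cannot accidentally override it by passing an extra argument. The cost is one extra factory function.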