Tags: python, mysql, peewee, pymysql

Proper peewee setup for avoiding MySQL sync issues when using Python 3 multiprocessing?


Today I got a nasty surprise with a friend's project I've taken over. When attempting to write to a remote MySQL db it spat out this error:

peewee.OperationalError: (2014, 'Command Out of Sync')

On closer inspection, the program only runs connect() to the DB at the init phase, and only runs close() once the program is exiting. However, this is a multiprocessing Python app, and it usually runs about four processes at a time, each reading and writing to our DB.

From my previous experience with MySQL, as well as research into this specific error, it seems that this occurs when two statements are executed simultaneously on the same connection. Therefore, some rewriting would be necessary to segment the methods.
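For illustration, here is a stripped-down version of that pattern, assuming the default fork start method on Linux; the connection settings and worker body are hypothetical stand-ins for the real project's code:

    from multiprocessing import Process

    from peewee import MySQLDatabase

    # Hypothetical connection settings, not the project's real ones.
    db = MySQLDatabase('my_app', host='db.example.com',
                       user='app', password='secret')
    db.connect()  # one connection opened in the parent at init...

    def worker():
        # ...which every forked child inherits, so several processes
        # end up interleaving packets on the same MySQL socket.
        db.execute_sql('SELECT 1')

    if __name__ == '__main__':
        for _ in range(4):
            Process(target=worker).start()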

Most of the methods are contained within the model class they interact with in the db, e.g.:

    from time import sleep

    from peewee import Model


    class Foo(Model):

        ...

        def register_foo(self) -> 'Foo':
            saved = False
            attempts = 0

            # Retry for up to 20 attempts before giving up.
            while not saved:
                if attempts >= 20:
                    print('error: could not register foo!')
                    return None

                try:
                    foo, created = Foo.get_or_create()
                except Exception as e:
                    print(e)
                    sleep(2.5)
                    attempts += 1
                    continue

                saved = True

            return foo

This doesn't seem right at all. I guess it kind of works around the problem sometimes, but I would prefer to just fix the issue with syncing to the db.

My question is: what would be the proper approach to segmenting a Python 3 script's peewee methods so that there are no sync issues when writing to a remote database?


Solution

  • On closer inspection, the program only runs connect() to the DB at the init phase, and only runs close() once the program is exiting. However, this is a multiprocessing Python app, and it usually runs about four processes at a time, each reading and writing to our DB.

    What you need to do is connect to the database after forking the new process. That way the internal state (file descriptors, etc.) is clean when the new process starts up.
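
    A minimal sketch of that approach, assuming peewee over pymysql and the fork start method; the database name, credentials, model, and worker function are placeholders rather than anything from the original project:

        from multiprocessing import Process

        from peewee import CharField, Model, MySQLDatabase

        # Placeholder connection settings; peewee connects lazily, so
        # no socket is opened here in the parent process.
        db = MySQLDatabase('my_app', host='db.example.com',
                           user='app', password='secret')

        class Foo(Model):
            name = CharField()

            class Meta:
                database = db

        def worker(n):
            # Connect *inside* the child, after the fork, so each
            # process gets its own socket and file descriptors.
            db.connect()
            try:
                Foo.get_or_create(name='foo-%d' % n)
            finally:
                db.close()

        if __name__ == '__main__':
            procs = [Process(target=worker, args=(i,)) for i in range(4)]
            for p in procs:
                p.start()
            for p in procs:
                p.join()

    Because each worker opens and closes its own connection, no two processes ever share a MySQL socket, which is what produces the "commands out of sync" error.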