Search code examples
pythonpeewee

Bulk inserting on table with foreign key field


Given the following models,

class Server(BaseModel):
    name = peewee.CharField(unique=True)

class Member(BaseModel):
    name = peewee.CharField(unique=True)
    server = peewee.ForeignKeyField(Server, related_name='members')

and a dictionary with keys being Server names and values being tuples of Member names,

data = {
    'Server01': ('Laurence', 'Rose'),
    'Server02': ('Rose', 'Chris'),
    'Server03': ('Isaac',)
}

what is the fastest way of bulk inserting Members using peewee? It seems like one should use Model.insert_many() here, but since Member.server expects a Server or Server.id, that would require iterating over data.items() and selecting a Server for each name.

for server_name, member_names in data.items():
    server = Server.select().where(Server.name == server_name)
    member_data = []
    for name in member_names:
        member_data.append({'name': name, 'server': server})

    with db.atomic():
        Member.insert_many(member_data)

Needless to say, this is terribly inefficient. Is there a better way of doing it?


Solution

  • Well, if you don't know ahead of time which servers are present in the DB it seems like your problem is the data-structure you're using. Keeping server_name -> member_names in a dict like that and trying to insert it all in one go is not how relational databases work.

    Try this:

    server_to_id = {}
    for server_name in data:
        if server_name not in server_to_id:
            server = Server.create(name=server_name)
            server_to_id[server_name] = server.id
    
    for server_name, member_names in data.items():
        server_id = server_to_id[server_name]
        member_data = [{'name': name, 'server': server_id} for name in member_names]
        Member.insert_many(member_data).execute()
    

    Note: don't forget to call .execute() when using insert() or insert_many().