
How to operate on a Django queryset without making further queries to the database


I'm new to Django (I've been working with it for about 1.5 months). Right now I'm on Django 1.6, so I'm looking for a quick fix, but in about a month I'll be moving to Django 1.9 (finally!).

My question is: how can I make one or two bigger queries to the database instead of many small ones? I would like to fetch all the group objects once and then look up the object or ID I'm interested in. With this database schema, I have to make tens of queries to add ONE COMPLETE record (each data record has ~10 parameters). I heard about a 'prefab' thing in Django 1.8. Will that be my answer next month? Or maybe good use of select_related?

I have to optimise this, because I can parse an Excel file with the data and create JSON in under 2 seconds, but updating the database this way can take more than 10 minutes...

I'm reading every record from the JSON file and need to add each one properly to this database, for example:

// Lowercase 'data' is the JSON object I'm reading from. Keys are in [] brackets.

allParametersDef = ParameterDef.objects.all().values_list("name", flat=True)

for record in allRecords:
    # One query (or two, when the group has to be created) per record
    group = Group.objects.get_or_create(name=data[record]['Group name'])[0]
    dataToSave = Data(gatheredBy=group, date=data[record]['Date'])
    dataToSave.save()

    for parameter in allParametersDef:
        # One SELECT for the definition plus one INSERT, for every parameter
        newParam = ParameterDef.objects.create(
            data=dataToSave,
            value=data[record][parameter]['Value'],
            definition=ParameterDef.objects.get(name=parameter),
            description=data[record][parameter]['Description']
        )
        newParam.save()

Data has many parameters (parameter is a table with id, name, description and value). ParameterDef is the general definition of a parameter (defining, for example, units).

If you really want to think about the structure of the database:

Table containing data objects:

data
    -> each data object has one FK to a group object (who gathered the data)
    -> a few parameters, each with an FK to the data object
        -> each parameter has an FK to its data object
        -> each parameter has a value
        -> each parameter has an FK to its parameter definition
        -> each parameter has a string with an additional description

For example, we have a data object {'Group name' : 'Gryffindor', 'Date' : '01-01-2000' }

and this object has a few nested parameters:

{ data: [there is FK to data], value: 123, definition: [there is FK to ParameterDef of temperature], description: 'Gathered at very windy and rainy day'}
{ data: [there is FK to data], value: 30, definition: [there is FK to ParameterDef of humidity], description: 'It was raining about 70% of the day'}
{ data: [there is FK to data], value: 70, definition: [there is FK to ParameterDef of breeziness], description: ''}

Solution

  • I see you are creating multiple objects by calling model.objects.create. This can be done more efficiently using bulk_create.

    You can collect the unsaved objects in a list instead of calling create for each one, like this:

    new_param_list = []
    for parameter in allParametersDef:
        # Build the object in memory only; nothing is written yet
        newParam = ParameterDef(data=dataToSave,
                                value=data[record][parameter]['Value'],
                                definition=ParameterDef.objects.get(name=parameter),
                                description=data[record][parameter]['Description'])
        new_param_list.append(newParam)

    # Insert all collected objects at once
    ParameterDef.objects.bulk_create(new_param_list)
    

    Also, you don't need to call save() after create(). create() already saves the object, so the extra save() is unnecessary.
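
The loop above still runs ParameterDef.objects.get(name=parameter) once per parameter, so a further step might be to load all definitions once, keep them in a dict keyed by name, and build the rows for every record before a single bulk_create. The following is only a sketch under assumptions: the helper name build_parameter_rows and its arguments (saved_data_by_record, definitions_by_name, parameter_cls) are hypothetical and not part of Django, and it assumes the data[record][name] layout shown in the question. The model class is passed in as an argument so the helper itself does not depend on any particular model name.

```python
def build_parameter_rows(data, saved_data_by_record, definitions_by_name, parameter_cls):
    """Build unsaved parameter objects for all records without querying the database.

    data                 -- parsed JSON, laid out as data[record][name]['Value' / 'Description']
    saved_data_by_record -- maps each record key to its already saved Data object
    definitions_by_name  -- maps each definition name to its definition object
    parameter_cls        -- class used to build one row (hypothetical argument)
    """
    new_params = []
    for record, data_obj in saved_data_by_record.items():
        for name, definition in definitions_by_name.items():
            entry = data[record][name]
            # Dict lookups only: no per-parameter SELECT, no per-parameter INSERT
            new_params.append(parameter_cls(
                data=data_obj,
                value=entry['Value'],
                definition=definition,
                description=entry['Description'],
            ))
    return new_params


# Hypothetical usage in the import code (model names as written in the question):
#
#   definitions_by_name = {d.name: d for d in ParameterDef.objects.all()}  # one query
#   rows = build_parameter_rows(data, saved_data_by_record,
#                               definitions_by_name, ParameterDef)
#   ParameterDef.objects.bulk_create(rows)
```

Each Data object still has to be saved before its parameters are built, because the parameters need its primary key for the foreign key. The same dict idea should also work for groups: load the existing Group objects once, key them by name, and create only the names that are missing.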