I am writing a script that synchronizes two databases. Some of the data should be stored as a tree, so I use django-mptt for the new DB. When syncing the DBs, I select new data from the old DB and need to save it in the new one.
I want to know if there is a better way to add new nodes to a tree. Right now it looks like this:
    ...
    # Add new data to DB
    for new_record in new_records:
        # Find appropriate parent using data in 'new_record'
        parent = get_parent(new_record)
        # Create object which should be added using data in 'new_record'
        new_node = MyMPTTModel(...)
        new_node.insert_at(parent, save=True)
        # Similar to:
        # new_node.insert_at(parent, save=False)
        # new_node.save()
But it works very slowly. I think that's because after each call to insert_at(..., save=True), django-mptt has to write the new node to the DB and modify the left and right keys of records that are already in the DB.
Is there any way to make django-mptt only modify a pending query each time I call insert_at and then apply all the changes together when I call save? Or do you know any other way to reduce the execution time?
Thanks in advance.
Firstly, don't use insert_at. It's not the reason for the slow performance, but it's unnecessary and looks ugly. Just set node.parent:
    for new_record in new_records:
        new_node = MyMPTTModel(..., parent=get_parent(new_record))
        new_node.save()
Now for the performance question. If you're using the latest mptt (git master, not 0.5.4), there's a context manager called delay_mptt_updates that prevents mptt from doing a lot of these updates until you've added all the nodes:
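    from django.db import transaction

    # One transaction around the whole import; tree updates are deferred
    # until the inner block exits.
    with transaction.atomic():
        with MyMPTTModel.objects.delay_mptt_updates():
            for new_record in new_records:
                new_node = MyMPTTModel(..., parent=get_parent(new_record))
                new_node.save()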
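To see why deferring these updates helps, here is a rough sketch of a nested-set insert in plain Python. It is a simplified model, not django-mptt's actual code: the `rows` dict and the `insert_last_child` helper are made up for illustration, and each `[lft, rght]` pair stands in for one DB row. It counts how many existing rows need their keys shifted for every single insert.

```python
def insert_last_child(rows, parent_id, new_id):
    """Insert new_id as the last child of parent_id.

    rows maps node id -> [lft, rght] (hypothetical stand-in for DB rows).
    Returns how many existing rows had a key changed.
    """
    parent_rght = rows[parent_id][1]
    touched = 0
    for keys in rows.values():
        changed = False
        # Every key at or after the parent's right edge moves by 2
        # to open a gap for the new node.
        if keys[0] >= parent_rght:
            keys[0] += 2
            changed = True
        if keys[1] >= parent_rght:
            keys[1] += 2
            changed = True
        touched += changed
    # The new node takes the freed slot.
    rows[new_id] = [parent_rght, parent_rght + 1]
    return touched


rows = {"root": [1, 2]}
touched = [
    insert_last_child(rows, "root", "a"),
    insert_last_child(rows, "root", "b"),
    insert_last_child(rows, "a", "c"),
]
print(touched)
print(rows)
```

In this toy model the third insert already rewrites all three existing rows. With a real table, each of those rewrites is an UPDATE on top of the INSERT, which is roughly the work you want to avoid repeating per node.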
Alternatively, if you're touching almost the entire tree, you can speed things up even more by using disable_mptt_updates and rebuilding the whole tree at the end:
    from django.db import transaction

    with transaction.atomic():
        with MyMPTTModel.objects.disable_mptt_updates():
            for new_record in new_records:
                new_node = MyMPTTModel(..., parent=get_parent(new_record))
                new_node.save()
        # Recompute the tree fields once, after updates are re-enabled
        MyMPTTModel.objects.rebuild()
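Conceptually, a rebuild only needs the parent links: it can walk the tree once and hand out left and right keys as it goes. The sketch below shows that idea in plain Python. It is an illustration under simplifying assumptions (a hypothetical `rebuild_keys` helper, a single tree, node ids in a dict), not what django-mptt runs internally.

```python
from collections import defaultdict


def rebuild_keys(parents):
    """Compute nested-set keys from parent links in one traversal.

    parents maps node id -> parent id (None for the root).
    Returns a dict of node id -> (lft, rght).
    """
    children = defaultdict(list)
    roots = []
    for node, parent in parents.items():
        if parent is None:
            roots.append(node)
        else:
            children[parent].append(node)

    keys = {}
    counter = 1

    def visit(node):
        nonlocal counter
        lft = counter          # assigned on the way down
        counter += 1
        for child in children[node]:
            visit(child)
        keys[node] = (lft, counter)  # rght assigned on the way back up
        counter += 1

    for root in roots:
        visit(root)
    return keys


parents = {"root": None, "a": "root", "b": "root", "c": "a"}
keys = rebuild_keys(parents)
print(keys)
```

Each node is visited exactly once, so the cost grows with the size of the tree rather than with the number of inserts times the size of the tree. The recursion here would need to become an explicit stack for very deep trees.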