Search code examples
djangomongodbbackbone.jstastypiedjango-mongodb-engine

Tastypie-nonrel, django, mongodb: too many nestings


I am developing a web application with django, backbone.js, tastypie and mongodb. In order to adapt tastypie and django to mongodb I am using django-mongodb-engine and tastypie-nonrel. This application has a model Project, which has a list of Tasks. So it looks like this:

   class Project(models.Model):
       user = models.ForeignKey(User)
       tasks = ListField(EmbeddedModelField('Task'), null=True, blank=True)

   class Task(models.Model):
       title = models.CharField(max_length=200)

Thanks to tastypie-nonrel, getting the list of task of a project is done in a simple way with a GET request at /api/v1/project/:id:/tasks/

Now I want to extend this Task model with a list of comments:

   class Task(models.Model):
       title = models.CharField(max_length=200)
       comments = ListField(EmbeddedModelField('Comment'), null=True, blank=True)

   class Comment(models.Model):
       text = models.CharField(max_length=1000)
       owner = models.ForeignKey(User)

The problem with this implementation is that tastypie-nonrel does not support another nesting, so is not possible to simple POST a comment to /api/v1/project/:id:/task/:id:/comments/

The alternative is to just make a PUT request of a Task to /api/v1/project/:id:/task/, but this would create problems if two users decide to add a comment to the same Task at the same time, as the last PUT would override the previous one.

The last option (aside from changing tastypie-nonrel) is to not embed Comment inside the Task and just hold the ForeignKey, so the request would go to /api/v1/Comment/. My question is if this breaks the benefits of using MongoDB (as it is needed cross queries)? Is there any better way of doing it?

I have little experience in any of the technologies of the stack, so it may be I am not focusing well the problem. Any suggestions are welcome.


Solution

  • It seems like you are nesting too much. That said, you can create custom methods/URL mappings for tastypie and then run your own logic instead of relying on "auto-magic" tastypie. If you are worried about the comment overriding issue, you need transactions anyway. Your code then should be robust enough to handle the behavior of a failed transaction, for example to retry. This would greatly throttle your writes for sure if you are constantly locking on a large object with many writers, however, which points to a design issue as well.

    One way you can mitigate this a bit is to write to intermediate source such as a task queue or redis, then dump in the comments as needed. It just depends on how reliable/durable your solution. A task queue would handle retries for failed transactions at least; with redis you can do something with pub/sub.

    You should consider a few things about your design IMO regarding MongoDB.

    • Avoid creating overly large monolithic objects. Although this is a benefit of Mongo, it depends on your usage. If you are for instance always returning your project as a top-level object, then as the tasks and comments grow, the network traffic alone will kill performance.

      Imagine a very contrived example in which the project specific data is 10k, each task alone is 5k, and each comment alone is 2k, if you have a project with 5 tasks, 10 comments per task, you are talking about 10k + 5*5k + 10*2k. For a very active project with lots of comments, this will be heavy sending across the network. You can do slice/projection queries to reconcile this issue, but with some limitations and implications.

    • A corollary to the above, structure your objects to your use cases. If you don't need to bring things back together, they can be in different collections. Just because you "think" you need to bring them back together, doesn't mean implementation wise they need to be retrieved in the same fetch (although this is normally ideal).

      Even if you do need everything in one use case/screen, another solution that may be possible in some designs is to load things in parallel, or even deferred via JavaScript after the page loads, using AJAX. For example, you could load the task info at the top, and then make an async call to load the comments separately, similar to how Disqus or Livefyre work as integrations in other sites. This could help resolve your nesting issue somewhat as you'd get rid of the task/project levels and simply store some IDs on each comment/record to be able to query between collections.

    • Keep in mind you may not want to retrieve all comments at once, and if you have a lot of comments, you may hit up against limitations of the size of one document. The size is bigger these days in recent versions of Mongo, however it usually doesn't make sense anyway to have a single record with a large amount of data, going back to the first item above.

    My recommendations are:

    1. Use transactions if you're concerned about losing comments.

    2. Add a task queue/redis/something durable if you're worried about competing writes and losing things as a result of #1. If not, ignore it. Is it the end of the world if you lose a comment?

    3. Consider restructuring particularly comments into a separate collection to ease your tastypie issues. Load things deferred or in parallel if needed.