
Django transaction.atomic: can it be trusted?


Could you take a look at my code? I'm using Django 1.8, django-rest-framework, and MySQL/InnoDB.

I expected every response to contain different values, but that is not what happened.

For example, values like these showed up in more than one response: '01500701.20040128100031383', '01500701.20040128100031262'

Sometimes a value that was already returned to one client is returned again to another client (worker1, worker2, worker3, ...). Why does this happen?

[view.py]

class NewsUrlList(mixins.ListModelMixin, mixins.CreateModelMixin, generics.GenericAPIView):
    def get(self, request, *args, **kwargs):

        abab = self.check_urlqueuesize()
        logger.debug("NewsUrlList Get Response id=[%s], data=[%s]", request.user.username, abab)
        return Response(abab, status=status.HTTP_200_OK)

    def post(self, request, *args, **kwargs):
        .. blsh blah..

...

@transaction.atomic
def check_urlqueuesize(self):
    if r.llen("url_list") < 5:  # <- redis list
        with transaction.atomic():
            readydata = NewsUrl.objects.filter(status='R')[:100]
            for a in readydata:
                r.rpush("url_list", a.link)  # <- redis list
                a.status = 'W'
                a.save(update_fields=['status'])

        responsjson = {}
        list1 = []

        for c in range(0, 7):
            list1.append(r.lpop("url_list").decode("utf-8"))

        responsjson["urls"] = list1
        return responsjson

[model.py]

class NewsUrl(models.Model):
    link = models.CharField(max_length=100, primary_key=True)
    title = models.TextField(default='')
    publisher = models.CharField(max_length=150, blank=True, default='')
    status = models.CharField(max_length=1, default='R')  # R:Ready, W:Working, D:Done, E:Error
    created = models.DateTimeField(auto_now_add=True)
    updated = models.DateTimeField(auto_now=True)

[log]

2015-12-10 15:42:42,941 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040128100031262', '01500701.20040128100031263', '01500701.20040128100031266', '01500701.20040128100031442', '01500701.20040128100031441', '01500701.20040128100031439', '01500701.20040128100031438'], 'size': 7}]
2015-12-10 15:42:54,639 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker2], data=[{'urls': ['01500701.20040128100031440', '01500701.20040128100031437', '01100201.20040128K1401.bak', '01100201.20040128K3602.bak', '01500701.20040128100031353', '01500701.20040128100031366', '01500701.20040128100031310'], 'size': 7}]
2015-12-10 15:42:57,148 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040128100031355', '01500701.20040128100031408', '01500701.20040128100031409', '01500701.20040128100031410', '01500701.20040128100031309', '01500701.20040128100031271', '01500701.20040128100031411'], 'size': 7}]
2015-12-10 15:42:57,555 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker1], data=[{'urls': ['01500701.20040128100031401', '01500701.20040128100031417', '01500701.20040128100031434', '01500701.20040128100031435', '01500701.20040128100031394', '01500701.20040128100031368', '01500701.20040128100031278'], 'size': 7}]
2015-12-10 15:43:08,069 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker2], data=[{'urls': ['01500701.20040128100031367', '01500701.20040128100031369', '01500701.20040128100031373', '01500701.20040128100031414', '01500701.20040128100031358', '01500701.20040128100031428', '01500701.20040128100031381'], 'size': 7}]
2015-12-10 15:43:09,262 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040128100031376', '01500701.20040128100031271', '01500701.20040128100031356', '01500701.20040128100031400', '01500701.20040128100031398', '01500701.20040128100031399', '01500701.20040128100031430'], 'size': 7}]
2015-12-10 15:43:09,731 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker1], data=[{'urls': ['01500701.20040128100031431', '01500701.20040128100031433', '01500701.20040128100031429', '01500701.20040128100031432', '01500701.20040128100031421', '01500701.20040128100031420', '01500701.20040128100031427'], 'size': 7}]
2015-12-10 15:43:24,308 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040128100031454', '01500701.20040128100031453', '01500701.20040128100031276', '01500701.20040128100031251', '01500701.20040128100031252', '01500701.20040128100031451', '01500701.20040128100031452'], 'size': 7}]
2015-12-10 15:43:24,686 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker1], data=[{'urls': ['01500701.20040128100031271', '01500701.20040128100031270', '01500701.20040128100031272', '01500701.20040128100031254', '01500701.20040128100031456', '01500701.20040128100031255', '01500701.20040128100031256'], 'size': 7}]
2015-12-10 15:43:33,765 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker2], data=[{'urls': ['01500701.20040128100031257', '01500701.20040128100031262', '01500701.20040128100031260', '01100401.20040226D1901.bak', '01100201.20040226K0401.bak', '01500701.20040226100036237', '01500701.20040127100031057'], 'size': 7}]
2015-12-10 15:43:40,245 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker1], data=[{'urls': ['01500701.20040127100031150', '01500701.20040127100031059', '01500701.20040127100031149', '01500701.20040127100031248', '01500701.20040127100031249', '01500701.20040127100031247', '01500701.20040127100031058'], 'size': 7}]
2015-12-10 15:43:41,202 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040127100031216', '01500701.20040127100031119', '01500701.20040127100031123', '01500701.20040127100031120', '01500701.20040127100031113', '01500701.20040127100031115', '01500701.20040127100031117'], 'size': 7}]
2015-12-10 15:43:46,030 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker2], data=[{'urls': ['01500701.20040127100031118', '01500701.20040127100031116', '01500701.20040127100031121', '01500701.20040127100031151', '01500701.20040127100031152', '01500701.20040127100031158', '01500701.20040127100031155'], 'size': 7}]
2015-12-10 15:43:52,295 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker1], data=[{'urls': ['01500701.20040128100031262', '01500701.20040127100031157', '01500701.20040127100031153', '01500701.20040127100031154', '01500701.20040127100031162', '01500701.20040127100031122', '01500701.20040127100031075'], 'size': 7}]
2015-12-10 15:43:53,130 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040127100031076', '01500701.20040127100031183', '01500701.20040127100031185', '01500701.20040127100031056', '01500701.20040127100031177', '01500701.20040127100031193', '01500701.20040127100031190'], 'size': 7}]
2015-12-10 15:43:59,004 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker2], data=[{'urls': ['01500701.20040127100031053', '01500701.20040127100031050', '01500701.20040127100031168', '01500701.20040127100031055', '01500701.20040127100031173', '01500701.20040128100031262', '01500701.20040127100031175'], 'size': 7}]
2015-12-10 15:44:05,372 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker1], data=[{'urls': ['01500701.20040127100031171', '01500701.20040127100031200', '01500701.20040127100031070', '01500701.20040127100031078', '01500701.20040127100031079', '01500701.20040127100031080', '01500701.20040127100031218'], 'size': 7}]
2015-12-10 15:44:05,825 views.py [19993]: DEBUG NewsUrlList Get Response id=[worker3], data=[{'urls': ['01500701.20040127100031093', '01500701.20040127100031091', '01500701.20040127100031090', '01500701.20040127100031089', '01500701.20040128100031383', '01500701.20040127100031092', '01500701.20040127100031068'], 'size': 7}]

Solution

  • Your question is a bit hard to understand, but what I think you're saying is that transaction.atomic() doesn't seem to be preventing the return of duplicate rows from the database.

    The short answer is that SQL databases support different transaction isolation levels, and the most common ones will indeed not protect the code you've written. I don't know enough about MySQL/InnoDB to say for sure, but I suspect that whatever isolation level you're using makes it possible for two concurrent processes to read the same R rows before the changes to W have been committed.

    The usual solution is to switch to a different isolation level (e.g. SERIALIZABLE) or to use explicit row locking. In either case you'll have to do some more research on your database and its configuration.
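The race described above, and the effect of locking, can be illustrated with a small toy simulation. This is only a sketch under stated assumptions: a plain dict stands in for the NewsUrl table, a `threading.Lock` stands in for a database row lock, and the names `make_table`, `read_ready`, `mark_working`, `claim`, and `run_workers` are hypothetical helpers invented for the illustration. It does not talk to MySQL or Django.

```python
import threading


def make_table(n):
    # Stand-in for the NewsUrl table: link -> status ('R' ready, 'W' working).
    return {"link-%03d" % i: "R" for i in range(n)}


def read_ready(table, limit):
    # Plays the role of NewsUrl.objects.filter(status='R')[:limit]:
    # a plain read that takes no lock.
    ready = [link for link, status in sorted(table.items()) if status == "R"]
    return ready[:limit]


def mark_working(table, links):
    # Plays the role of setting status = 'W' and saving each row.
    for link in links:
        table[link] = "W"


# Unsafe interleaving: both workers read before either one has written.
table = make_table(10)
seen_by_a = read_ready(table, 3)
seen_by_b = read_ready(table, 3)  # worker A has not "committed" yet
mark_working(table, seen_by_a)
mark_working(table, seen_by_b)
duplicates = sorted(set(seen_by_a) & set(seen_by_b))  # handed out twice

# Safe variant: the read and the update happen while holding one lock,
# so a second worker cannot read until the first has marked its rows.
claim_lock = threading.Lock()


def claim(table, limit):
    with claim_lock:
        links = read_ready(table, limit)
        mark_working(table, links)
    return links


def run_workers(table, n_workers, per_worker):
    # Run several workers concurrently and collect each worker's batch.
    results = []

    def work():
        results.append(claim(table, per_worker))

    threads = [threading.Thread(target=work) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In the unsafe part, `duplicates` contains the links that both workers received; in the locked variant, the batches returned by `run_workers` never overlap. In Django, the row-locking counterpart of `claim` would likely be a queryset built with `select_for_update()` and evaluated inside `transaction.atomic()`, so that the second transaction's read blocks until the first one commits. That variant is not exercised here and should be checked against your MySQL configuration.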