Search code examples
djangosearchfilterdjango-q

Strange behavior with complex Q object filter queries in Django


Hi I am trying to write a tagging system for Django, but today I encountered a strange behavior in filter or the Q object (django.db.models.Q).

I wrote a function, that converts a search string into a Q object. The next step would be to filter the TaggedObject with these query. But unfortunately I get a strange behavior.

search for only one Tag element:

when I search (id=20) => Q: (AND: ('tags__tag__id', 20)) and it returns 2 Taged Objects with the ID 1127 and 132

when I search (id=4) => Q: (AND: ('tags__tag__id', 4)) and it returns also 2 Objects, but this time 1180 and 1127

here is the reluting SQL query:

SELECT "django_content_type"."id", "django_content_type"."name", "django_content_type"."app_label", "django_content_type"."model" 
FROM "django_content_type" 
WHERE ("django_content_type"."model" = slogan  AND "django_content_type"."app_label" = slogans ) 
ORDER BY "django_content_type"."name" ASC

SELECT "slogans_slogan"."id", "slogans_slogan"."headline", "slogans_slogan"."text", "slogans_slogan"."author"
FROM "slogans_slogan"
  INNER JOIN "htags_objecttagbridge" ON ("slogans_slogan"."id" = "htags_objecttagbridge"."object_id")
WHERE ("htags_objecttagbridge"."tag_id" = 4  AND "htags_objecttagbridge"."content_type_id" = 9 )
LIMIT 21

search for two tags with 'or' conjunction:

until here is everything fine, but when i make a little bit more complex query like (id=4) or (id=20) => Q: (OR: ('tags__tag__id', 4), ('tags__tag__id', 20)) then it returns 4(!) Objects 1180, 1127, 1127, 132

and the SQL:

SELECT "slogans_slogan"."id", "slogans_slogan"."headline", "slogans_slogan"."text", "slogans_slogan"."author"
FROM "slogans_slogan"
  INNER JOIN "htags_objecttagbridge" ON ("slogans_slogan"."id" = "htags_objecttagbridge"."object_id")
WHERE ((("htags_objecttagbridge"."tag_id" = 4 AND "htags_objecttagbridge"."content_type_id" = 9 ) OR "htags_objecttagbridge"."tag_id" = 20 ) AND "htags_objecttagbridge"."content_type_id" = 9 )
LIMIT 21

But the object with the ID 1127 is returned twice, but thats not the behaviour I want. Do I have to live with it, and uniqify that list or can I do something different. The representation of the Q object looks fine for me.

search for two tags 'and' conjunction

But the worst is now, when I search for (id=20) and (id=4) => Q: (AND: ('tags__tag__id', 20), ('tags__tag__id', 4)) then it returns no object at all. But why? The representation should be ok and the object with the id 1127 is tagged by both. What am I missing?

here is the SQL again:

SELECT "slogans_slogan"."id", "slogans_slogan"."headline", "slogans_slogan"."text", "slogans_slogan"."author" 
FROM "slogans_slogan"
  INNER JOIN "htags_objecttagbridge" ON ("slogans_slogan"."id" = "htags_objecttagbridge"."object_id")
WHERE ("htags_objecttagbridge"."tag_id" = 4  AND "htags_objecttagbridge"."content_type_id" = 9  AND "htags_objecttagbridge"."tag_id" = 20 )
LIMIT 21

[edit]: I now realize, that this SQL statement is wrong. At least not what I want, because here it wants, that one ObjectTagBridge has the id 4 and at the same time the id 20. But in my case these are 2 different

Here are also the relevant parts of the classes, that are involved:

class TaggedObject(models.Model):
    """
        class that represent a tagged object
    """
    tags = generic.GenericRelation('ObjectTagBridge',
                                   blank=True, null=True)

class ObjectTagBridge(models.Model):
    """
        Help to connect a generic object to a Tag.
    """
    # pylint: disable-msg=W0232,R0903
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    content_object = generic.GenericForeignKey('content_type', 'object_id')
    tag = models.ForeignKey('Tag')

class Tag(models.Model):
    ...

Thanks for your help


Solution

  • For problem 1 (uniqueness): you're going to want to make your query distinct. The duplication is expected behavior for that type of query unless you make it distinct.

    For problem 2, you're probably hitting a subtle but important part of how querysets work. If you make a query like this:

    mymodel.objects.filter(tags__tag__id=4, tags__tag__id=5)
    

    You are querying for a model that has a single tag which has both id=4 and id=5, which is of course no tag. But if you instead query like this:

    mymodel.objects.filter(tags__tag__id=4).filter(tags__tag__id=5)
    

    you get models which have some tag somewhere with id=4 and some tag somewhere with id=5. The same will hold with Q objects -- they will need to be split into separate filter or exclude calls to not be referring to a single Tag relation. This is documented here.