
Django DB design to glob words quickly


I need to quickly look up words for a web application that I am writing in Django. I was thinking of putting each character of the word in an IntegerField of its own, indexed by position.

class Word(models.Model):
    word = models.CharField(max_length=5)
    length = models.IntegerField()

    c0 = models.IntegerField(blank=True, null=True)
    c1 = models.IntegerField(blank=True, null=True)
    c2 = models.IntegerField(blank=True, null=True)
    c3 = models.IntegerField(blank=True, null=True)
    c4 = models.IntegerField(blank=True, null=True)

    mapping = [c0, c1, c2, c3, c4,]

    def save(self):
        self.length = len(self.word)
        for (idx, char) in enumerate(self.word):
            self.mapping[idx] = ord(char)
        models.Model.save(self)

Then I could make queries like Word.objects.filter(length=4, mapping[2]=ord('A')) to find all words of length four that have an A in the third position.

I'm not really sure about the design and some of the mechanics so I thought I would ask for suggestions here before I went and tried to implement it. I'm not entirely sure about the syntax for making queries.

So, I guess the questions would be:

  1. Do you have any suggestions for the design?
  2. Would mapping[2] work?
  3. Would I be able to pass in a dictionary to the filter command so that I can have a variable number of keyword arguments?

Thanks!


Solution

    Would mapping[2] work?

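    No, it wouldn't. Keyword argument names must be plain identifiers, so filter(mapping[2]=...) is a syntax error. The mapping list also holds the class-level field objects, so assigning to self.mapping[idx] in save() replaces entries in a list shared by every instance instead of setting c0 to c4 on the instance being saved.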
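
One way to get the per-position columns filled in is to set them by name with setattr. The following is only a sketch: position_fields and fill_columns are hypothetical helper names, and a SimpleNamespace stands in for a Word instance so that it runs without a configured Django project.

```python
from types import SimpleNamespace

MAX_LENGTH = 5  # matches max_length=5 on the question's Word.word field


def position_fields(word):
    """Map each character of `word` to its column name and code point."""
    if len(word) > MAX_LENGTH:
        raise ValueError("word is longer than %d characters" % MAX_LENGTH)
    return {"c%d" % idx: ord(char) for idx, char in enumerate(word)}


def fill_columns(instance, word):
    """Set length and c0..c4 on `instance`; unused columns become None."""
    instance.length = len(word)
    values = position_fields(word)
    for idx in range(MAX_LENGTH):
        name = "c%d" % idx
        setattr(instance, name, values.get(name))


# Stand-in for a Word model instance, so the sketch runs without Django.
row = SimpleNamespace(word="CAT")
fill_columns(row, row.word)
```

In the model itself, save(self, *args, **kwargs) could call fill_columns(self, self.word) and then pass its arguments on to the parent save().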

    Would I be able to pass in a dictionary to the filter command so that I can have a variable number of keyword arguments?

    Certainly. For instance:

    conditions = dict(word__startswith = 'A', length = 5)
    Word.objects.filter(**conditions)
    

    This finds all Word instances that start with A and are five characters long.
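
The same unpacking works when the keyword names are computed at run time, which is what the per-position columns from the question would need. A minimal sketch, assuming a pattern syntax where ? means "any character" (build_conditions is a hypothetical helper, not part of Django):

```python
def build_conditions(pattern, wildcard="?"):
    """Turn a pattern such as '??A?' into keyword arguments for filter()."""
    conditions = {"length": len(pattern)}
    for idx, char in enumerate(pattern):
        if char != wildcard:
            # Column names follow the question's c0..c4 naming.
            conditions["c%d" % idx] = ord(char)
    return conditions


conditions = build_conditions("??A?")
# With the question's model: Word.objects.filter(**conditions)
```

Here conditions holds length=4 and c2=ord('A'), so only the positions that are actually constrained end up in the query.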

    Do you have any suggestions for the design?

    This feels to me like a case of premature optimization. I suspect that for moderate volumes of data you should be able to combine a suitable database function with Django's filter to get what you need.

    For example:

    to find all words of length four that have an A in the third position.

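    you can combine a Django filter with a database string function. PostgreSQL's strpos returns only the position of the first occurrence, so it would miss a word such as ABAC that also has an A earlier on. substr, which extracts the character at a given position, is the safer choice. Passing the values through params lets the database driver quote them instead of interpolating them into the SQL by hand.

    position, char = 3, 'A'
    where = ["substr(word, %s, 1) = %s"]
    Word.objects.filter(length = 4).extra(where = where, params = [position, char])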
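
Another option that stays inside the ORM is Django's regex field lookup. The sketch below assumes a simple glob syntax in which ? matches one character and * matches any run of characters; glob_to_regex is a hypothetical helper. It sticks to regex features that Python and the common database backends share, but escaping of unusual characters may differ between engines.

```python
import re


def glob_to_regex(pattern):
    """Translate '?' (any one character) and '*' (any run) into an anchored regex."""
    parts = []
    for char in pattern:
        if char == "?":
            parts.append(".")
        elif char == "*":
            parts.append(".*")
        else:
            # Escape everything else so it is matched literally.
            parts.append(re.escape(char))
    return "^" + "".join(parts) + "$"


regex = glob_to_regex("??A?")
# With the question's model: Word.objects.filter(word__regex=regex)
```

A regex match usually cannot use an ordinary index, so this fits the "moderate volumes of data" case rather than very large word lists.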