Search code examples
javascriptautocompleteprototypejsscriptaculous

Prune Down Autocomplete Terms


I have a very large list of terms for use in an autocomplete box. I've been mulling over a few different scenarios for how to prune them down, but I haven't come up with anything great yet.

Basically, the structure is very similar to a record label -

  • An artist has albums An album has songs
  • Individual songs could be popular, albums are mostly sums of their underlying song popularity
  • Albums also have highly variable number of songs in them - so if an album has hundreds of song, it's very likely that someone would want to search for it, and much less likely if it has less songs
  • As the autocomplete becomes more specific (more letters), I'd like less likely terms to be shown

I'm thinking something like this:

Apple   10
Banana  10
Crab    20
Diner   30
Dish    20
Daily   10
Diver   20
Dice    10

If this is the list of albums, and the "score" i assign them, I simply pop choose the list based on the length of the list I'm showing (3 for example) and then by score - I hit "D" above, and "Diner", "Dish" and "Diver" show up, and then "i" and it becomes "Diner", "Dish" and "Diver".

Is there a particular algorithm that does this? Or an AJAX autocompleter built for this? I'm currently using Prototype/Scriptaculous but I can't seem to get it right.


Solution

  • This is not an easy algorithm to implement, since you are trying to index a data structure in two ways - lexicographically and by popularity.

    One way to do it might be to build a compressed trie of the songs, where at each node you store a pre-built list of the N most popular songs beginning with that prefix. This would take a lot of storage (O(NUM_SONGS * N)), but would allow fast lookup (O(PREFIX)).