Search code examples
javaalgorithmsearchspelling

how to optimize a search algorithm?


so basically what i mean here is how do i make a search tool(like to search through a series of Strings, perhaps in an array or an arraylist, etc) that would be useful? not necessarily fast, but useful.

for example, how easy would it be to incorporate "autocorrect" where the term you're searching for yields results that are similar in spelling but not exact? or results that match the first 3 characters but not the whole word, or results that may include the word but are not necessarily comprised of the entire word? is there an API for this or a class or is there an algorithm that would help me out here?


Solution

  • in a nutshell, for SIMILAR strings, you can use the "Edit Distance" algorithm, that finds similarity (in fact it finds number of moves to turn one string into another, but that's a kind of similarity), and for the AUTOCOMPLETE tool, you can use "Trie" data structure, which works as a tree of chars, and as it reads the chars of the current word, it stops in a node that shows you where alse it can go to get to existent words. To SEARCH something that includes the word (string), I suppose you can use KMP algorithm (or Aho-Corasick, if you wish to search for more than one word within the whole text).

    https://en.wikipedia.org/wiki/Edit_distance

    https://en.wikipedia.org/wiki/Trie

    https://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm

    https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm