Search code examples
javakeyboardlevenshtein-distanceneighbours

How to check efficiently if two characters are neighbours on the keyboard?


I want to develop a soft keyboard for Android and already got a autocorrect algorithm which makes suggestions based on the fact if the input character and the character of a word from the dictionary are neighbours on the keyboard. This works in combination with the levenshtein-algorithm (if a character has to be replaced with a different character it is checked if they are neighbours). That's why this check is called very frequently. Currently, it consumes 50% of the time spent on autocorrection.

My current approach is a seperate trie with 3 layers. First layer: first character. Second layer: second character: Third layer: boolean holding the information if the characters are neighbours. But I'm afraid a trie is overkill? The intern hashmaps for every children may slow it down, as well? Should I build a hashmap with an own charToNumber-function?

How would you do this? Which bottlenecks can be avoided? Character.toLowerCase() seems to be inefficient as well when it's called everytime a check is performed.

I hope you can help me speeding up the task :)


Solution

  • What about assigning numbers to each key and use that to determine proximity.

        public static void main(String[] args) {
        double[] d = new double[26];
        d['q'-97] = 100d;
        d['w'-97] = 101d;
        d['e'-97] = 102d;
        d['r'-97] = 103d;
        d['t'-97] = 104d;
        //(optionally, put a space of 5 between right hand and left hand for each row)
        d['y'-97] = 105d;
        d['u'-97] = 106d;
        d['i'-97] = 107d;
        d['o'-97] = 108d;
        d['p'-97] = 109d;
    
    
        //my keyboard middle row is about 20% indented from first row
        d['a'-97] = 200.2;
        d['s'-97] = 201.2;
        d['d'-97] = 202.2;
        d['f'-97] = 203.2;
        d['g'-97] = 204.2;
        d['h'-97] = 205.2;
        d['j'-97] = 206.2;
        d['k'-97] = 207.2;
        d['l'-97] = 208.2;
    
        //third row is about 50% indented from middle row
        d['z'-97] = 300.5;
        d['x'-97] = 301.5;
        d['c'-97] = 302.5;
        d['v'-97] = 303.5;
        d['b'-97] = 304.5;
        d['n'-97] = 305.5;
        d['m'-97] = 306.5;
    
        for (char a = 'a'; a <= 'z'; a++) {
            for (char b = 'a'; b <= 'z'; b++)
                if (a != b && prox(a,b,d))
                    System.out.println(a + " and " + b + " are prox");
        }
    
    }
    
    static boolean prox(char a, char b, double m) {
        double a1 = m[a-97];
        double a2 = m[b-97];
    
        double d = Math.abs(a1-a2);
        //TODO: add in d == 5 if there is a spacing for left and right hand gap (since it's more unlikely of a crossover)
        return d == 0 || d == 1 || (d >= 99 && d <= 101);
    }
    

    Partial Output:

    a and q are prox
    a and s are prox
    a and w are prox
    a and z are prox
    ....
    g and b are prox
    g and f are prox
    g and h are prox
    g and t are prox
    g and v are prox
    g and y are prox   
    ....
    y and g are prox
    y and h are prox
    y and t are prox
    y and u are prox