Tags: java, android, speech-recognition, voice-recognition, speech-to-text

Spell letter by letter


I'm building an Android app that takes user input through speech-to-text.

The user will dictate codes such as 'AA001', 'BC022', 'AD011', and so on. I can already open the voice recognition activity and get the user's input from it (snippet below), but it returns words.

I need a way to configure it so that it returns only the letters and digits the user actually spoke.

private void promptSpeechInput() {
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
            getString(R.string.speech_prompt));
    try {
        startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
    } catch (ActivityNotFoundException a) {
        Toast.makeText(getApplicationContext(),
                getString(R.string.speech_not_supported),
                Toast.LENGTH_SHORT).show();
    }
}

Solution

  • You cannot do this with the Google engine. You can do it with other engines such as CMUSphinx, where you can specify a grammar that recognizes only letters and digits. The grammar should look like this:

    #JSGF V1.0;
    grammar alphadigits;
    public <letters> = (one | two | three | four | ... | a. | b. | c. | d. ... | z.)*;
    

    Such a grammar will return only letters and digits, with higher accuracy than Google's native API.

    For better recognition accuracy, it is also recommended to add letter combinations to the grammar rather than single letters only. For example, if you expect "aa", add "aa" to the grammar. Single letters are too short to be recognized reliably.
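
A minimal sketch of how the grammar above could be generated and how a recognized hypothesis could be turned back into a code such as "AA001". The class `AlphaDigits` and its methods are hypothetical helpers, not part of CMUSphinx. It assumes letters are spelled "a." through "z." as in the grammar above, that digits are recognized as English number words, and that any extra letter combinations (for example "aa") also have pronunciations in the recognizer's dictionary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: names are illustrative, not a CMUSphinx API. */
final class AlphaDigits {
    private static final String[] DIGIT_WORDS = {
        "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"
    };

    // Maps each grammar token to the character it stands for (insertion order is kept).
    private static final Map<String, String> TOKENS = new LinkedHashMap<>();
    static {
        for (int d = 0; d < DIGIT_WORDS.length; d++) {
            TOKENS.put(DIGIT_WORDS[d], String.valueOf(d));
        }
        // Assumption: single letters are written "a.", "b.", ... as in the grammar above.
        for (char c = 'a'; c <= 'z'; c++) {
            TOKENS.put(c + ".", String.valueOf(Character.toUpperCase(c)));
        }
    }

    /** Builds a JSGF grammar accepting any sequence of digits, letters, and the given combinations. */
    static String buildGrammar(List<String> combos) {
        List<String> alternatives = new ArrayList<>(TOKENS.keySet());
        alternatives.addAll(combos);
        return "#JSGF V1.0;\n"
             + "grammar alphadigits;\n"
             + "public <letters> = (" + String.join(" | ", alternatives) + ")*;\n";
    }

    /** Converts a hypothesis such as "a. a. zero zero one" into "AA001". */
    static String decode(String hypothesis, List<String> combos) {
        StringBuilder code = new StringBuilder();
        for (String token : hypothesis.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty hypothesis
            }
            String mapped = TOKENS.get(token);
            if (mapped != null) {
                code.append(mapped);
            } else if (combos.contains(token)) {
                // A letter combination such as "aa" stands for its own letters.
                code.append(token.toUpperCase(Locale.ROOT));
            } else {
                throw new IllegalArgumentException("Unexpected token: " + token);
            }
        }
        return code.toString();
    }
}
```

The string returned by `buildGrammar` could be written to a grammar file and handed to whichever CMUSphinx binding the app uses, and `decode` would then be applied to the hypothesis text the recognizer reports. Rejecting unknown tokens, rather than skipping them, is a deliberate choice here so that a misconfigured grammar shows up early.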