java speech-recognition speech-to-text cmusphinx

Can I use the Google Speech Recognition API in my desktop application?


I want to know whether I can use Google's speech recognition API in my desktop application. I have seen some examples in which I have to save the speech to a file and send it to a URL. That would be cumbersome, because in my application the user has to submit speech continuously. Is there any alternative way to use the Google speech API? I would rather not use Sphinx, because its accuracy is quite low and I don't know how to add new words to its dictionary; without adding them, it won't recognize new words. Any help would be appreciated.


Solution

  • Are you referring to ambient listening? I am working on a Voice Activity Detection algorithm for use with the Google Speech Recognition API. Although I haven't finished the algorithm yet, I've added a volume and frequency calculator so that you don't have to send requests to Google when nobody is talking. Here is the link to the source code:

    https://github.com/The-Shadow/java-speech-api

    (This isn't what I use, but it's simple. You can also add frequency thresholds and so on. I threw this code together, so there is no guarantee it will work; look at the example branch of the API.)

    import javax.sound.sampled.AudioFileFormat;
    
    // Classes from java-speech-api (linked above). The package of
    // MicrophoneAnalyzer is assumed here; check it against your copy of the library.
    import com.darkprograms.speech.microphone.MicrophoneAnalyzer;
    import com.darkprograms.speech.recognizer.GoogleResponse;
    import com.darkprograms.speech.recognizer.Recognizer;
    
    public class RecognitionMain {
    
        public static void main(String[] args) {
            try {
                ambientListening();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    
        private static void ambientListening() throws Exception {
            String filename = "tarunaudio.wav"; // Your desired file name
            final int THRESHOLD = 10;           // Your threshold value

            // Loop rather than recurse, so a long-running session does not grow the call stack.
            while (true) {
                MicrophoneAnalyzer mic = new MicrophoneAnalyzer(AudioFileFormat.Type.WAVE);
                mic.open();
                mic.captureAudioToFile(filename);
                int ambientVolume = mic.getAudioVolume(); // Baseline taken as capture starts
                int speakingVolume = -2;
                boolean speaking = false;
                // Poll once; keep polling only while speech is in progress.
                for (int i = 0; i < 1 || speaking; i++) {
                    int volume = mic.getAudioVolume();
                    System.out.println(volume);
                    if (volume > ambientVolume + THRESHOLD) {
                        // Louder than the baseline: treat as speech.
                        speakingVolume = volume;
                        speaking = true;
                        Thread.sleep(1000);
                        System.out.println("SPEAKING");
                    }
                    if (speaking && volume + THRESHOLD < speakingVolume) {
                        break; // Volume dropped well below the speaking level: utterance ended
                    }
                    Thread.sleep(200); // Your refresh rate
                }
                mic.close();
                // You can also measure the volume across the entire file if you want
                // to be resource intensive.
                if (!speaking) {
                    continue; // Nothing was said, so skip the request to Google
                }
                Recognizer rec = new Recognizer(Recognizer.Languages.ENGLISH_US);
                GoogleResponse out = rec.getRecognizedDataForWave(filename);
                System.out.println(out.getResponse());
            }
        }
    }
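
The volume check in the answer can also be tried without a microphone or the library. The following is a minimal sketch, assuming 16-bit little-endian signed PCM input; the class VolumeGate and its methods rms, update, and isSpeaking are hypothetical names made up for this illustration and are not part of java-speech-api. It computes an RMS volume for a buffer and applies the same two threshold tests as the loop above.

```java
/** Hypothetical helper for illustration; not part of java-speech-api. */
final class VolumeGate {
    private final double ambient;   // baseline volume measured while nobody talks
    private final double margin;    // plays the role of THRESHOLD in the answer
    private boolean speaking;
    private double speakingVolume;

    VolumeGate(double ambient, double margin) {
        this.ambient = ambient;
        this.margin = margin;
    }

    /** RMS level of the first `length` bytes, read as 16-bit little-endian signed PCM. */
    static double rms(byte[] pcm, int length) {
        int samples = length / 2;
        if (samples == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int lo = pcm[2 * i] & 0xFF;   // low byte, unsigned
            int hi = pcm[2 * i + 1];      // high byte, sign-extended
            int sample = (hi << 8) | lo;
            sumSquares += (double) sample * sample;
        }
        return Math.sqrt(sumSquares / samples);
    }

    /** Feeds one volume reading; returns true when an utterance has just ended. */
    boolean update(double volume) {
        if (volume > ambient + margin) {
            // Louder than the baseline by the margin: speech is in progress.
            speaking = true;
            speakingVolume = volume;
            return false;
        }
        if (speaking && volume + margin < speakingVolume) {
            // Dropped well below the last speaking level: the utterance ended.
            speaking = false;
            return true;
        }
        return false;
    }

    boolean isSpeaking() {
        return speaking;
    }

    public static void main(String[] args) {
        VolumeGate gate = new VolumeGate(5, 10);
        double[] readings = {6, 30, 28, 12};
        for (double r : readings) {
            System.out.println(r + " -> ended=" + gate.update(r));
        }
    }
}
```

Only when update returns true would you send the recorded audio to the recognizer, which is what keeps silent periods from generating requests.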