Sphinx Voice Activity Detection


I'm trying to write a simple program that detects voice activity in a .wav file using the CMU Sphinx library.

So far, I have the following:

SpeechClassifier s = new SpeechClassifier();

s.setPredecessor(dataSource);
Data d = s.getData();

while(d != null) {
    if(s.isSpeech()) {
        System.out.println("Speech is detected");
    }
    else {
        System.out.println("Speech has not been detected");
    }

    System.out.println();
    d = s.getData();
}

I only ever get the output "Speech has not been detected", even though there is speech in the audio file. It seems that getData() is not working the way I expect. I want it to read the audio frame by frame so that s.isSpeech() can tell me whether each frame contains speech.

I'd like one line of output per frame ("Speech is detected" or "Speech has not been detected"). How can I fix my code? Thanks!


Solution

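  • Insert a DataBlocker before the SpeechClassifier:

     DataBlocker b = new DataBlocker(10); // block size: 10 ms
     // arguments: frame length (ms), adjustment, threshold, minimum signal
     SpeechClassifier s = new SpeechClassifier(10, 0.003, 10, 0);
     b.setPredecessor(dataSource);
     s.setPredecessor(b);

    The classifier then receives the audio in 10-millisecond frames, matching the frame length it is configured with.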
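
To make the idea of 10 ms blocking more concrete, here is a minimal, library-free sketch of cutting a sample buffer into fixed-size frames and labeling each one. It is not how Sphinx implements DataBlocker or SpeechClassifier. The class FrameVadSketch, its methods, the 16 kHz sample rate and the RMS threshold of 0.05 are all assumptions made up for illustration, and the "speech" here is just a synthetic tone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Sphinx API.
class FrameVadSketch {

    /** Number of samples in one frame of the given length. */
    static int samplesPerFrame(int sampleRateHz, int frameLengthMs) {
        return sampleRateHz * frameLengthMs / 1000;
    }

    /** Splits samples into consecutive fixed-size frames. In this sketch a trailing partial frame is dropped. */
    static List<double[]> block(double[] samples, int frameSize) {
        List<double[]> frames = new ArrayList<>();
        for (int start = 0; start + frameSize <= samples.length; start += frameSize) {
            double[] frame = new double[frameSize];
            System.arraycopy(samples, start, frame, 0, frameSize);
            frames.add(frame);
        }
        return frames;
    }

    /** Root-mean-square amplitude of one frame. */
    static double rms(double[] frame) {
        double sumSquares = 0.0;
        for (double s : frame) {
            sumSquares += s * s;
        }
        return Math.sqrt(sumSquares / frame.length);
    }

    /** Crude stand-in for a classifier: a frame counts as speech if its RMS exceeds the threshold. */
    static boolean looksLikeSpeech(double[] frame, double rmsThreshold) {
        return rms(frame) > rmsThreshold;
    }

    public static void main(String[] args) {
        int sampleRateHz = 16000;                          // assumed sample rate
        int frameSize = samplesPerFrame(sampleRateHz, 10); // samples per 10 ms frame

        // Synthetic signal: 20 ms of silence followed by 20 ms of a 440 Hz tone.
        double[] samples = new double[4 * frameSize];
        for (int i = 2 * frameSize; i < samples.length; i++) {
            samples[i] = 0.5 * Math.sin(2 * Math.PI * 440 * i / sampleRateHz);
        }

        // One decision per frame, which is the per-frame output the question asks for.
        for (double[] frame : block(samples, frameSize)) {
            System.out.println(looksLikeSpeech(frame, 0.05)
                    ? "Speech is detected"
                    : "Speech has not been detected");
        }
    }
}
```

The point of the sketch is only that a per-frame decision needs the input cut into frames of a known length first, which is the role the DataBlocker plays in the pipeline above.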