
Why is pocketsphinx returning a null hypothesis via java with kws? Works via commandline, not via code


I've been working with pocketsphinx in Java and have pieced this code together from various sources.

I'm trying to do keyword detection via pocketsphinx.

As the title says, it works from the command line:

pocketsphinx_continuous -inmic yes -kws keyphrase.list

Where the keyphrase.list file contains:

abomination /le-20/

I get a hit every time.

Here is my Java code (I've tried le-1 up through le-40):

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

import edu.cmu.pocketsphinx.Decoder;
import edu.cmu.pocketsphinx.Config;
import edu.cmu.pocketsphinx.Hypothesis;

public class Controller {
    static {
        System.loadLibrary("pocketsphinx_jni");
    }

    private static ByteArrayOutputStream out;

    public static void main(String args[]) {

        AudioFormat format = new AudioFormat(44100, 16, 1, true, true);
        TargetDataLine targetLine = null;
        DataLine.Info targetInfo = new DataLine.Info(TargetDataLine.class, format);
        boolean running = true;


        try {

            targetLine = AudioSystem.getTargetDataLine(format);
            targetLine.open();
            out = new ByteArrayOutputStream();
            int numBytesRead;
            byte[] data = new byte[targetLine.getBufferSize() / 5];


            Config c = Decoder.defaultConfig();
            c.setString("-hmm", "/usr/local/share/pocketsphinx/model/en-us/en-us/");
            //c.setString("-lm", "/usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin");
            c.setString("-dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict");
            c.setString("-keyphrase", "abomination");
            c.setFloat("-kws_threshold", 1e-1);

            Decoder d = new Decoder(c);
            d.setRawdataSize(300000);

            targetLine.start();
            System.out.println("Recorder started");

            byte[] b = new byte[4096];

            d.startUtt();

            System.out.println("Decoder started");

            while ((running)) {
                int nbytes;
                short[] s = null;
                nbytes = targetLine.read(b,0,b.length);

                ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
                s = new short[nbytes/2];

                bb.asShortBuffer().get(s);

                d.processRaw(s, nbytes/2, false, false);
                d.setKws("abomination", );

                if (nbytes > 0) {

                    Hypothesis hypothesis = d.hyp();
                    if (hypothesis != null) {
                        System.out.println("------------------------------------------------------");
                        System.out.println(hypothesis.getHypstr());
                        System.out.println("------------------------------------------------------");

                        d.endUtt();
                        d.startUtt();
                    }
                }
            }

        }
        catch (Exception e) {
            System.err.println(e);
        }
    }
}

The code runs fine. It just never enters the if (hypothesis != null) branch for some reason.

This is what is logged:

INFO: cmn_prior.c(99): cmn_prior_update: from < 40.00  3.00 -1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 51.22 14.61 -8.72 -0.31 -3.49  0.18 -7.35  8.43 -0.77  7.64  1.41  0.27 -1.82 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 51.22 14.61 -8.72 -0.31 -3.49  0.18 -7.35  8.43 -0.77  7.64  1.41  0.27 -1.82 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 51.80 15.37 -8.77 -0.62 -2.74  0.09 -6.18 10.24  0.14  7.79  2.59  1.86 -3.22 >

UPDATE: more info.

This is the output when the program runs. Note that -kws is not set.

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us//feat.params
Current configuration:
[NAME]          [DEFLT]     [VALUE]
-agc            none        none
-agcthresh      2.0     2.000000e+00
-allphone               
-allphone_ci        no      no
-alpha          0.97        9.700000e-01
-ascale         20.0        2.000000e+01
-aw         1       1
-backtrace      no      no
-beam           1e-48       1.000000e-48
-bestpath       yes     yes
-bestpathlw     9.5     9.500000e+00
-ceplen         13      13
-cmn            current     current
-cmninit        8.0     40,3,-1
-compallsen     no      no
-debug                  0
-dict                   /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase       no      no
-dither         no      no
-doublebw       no      no
-ds         1       1
-fdict                  
-feat           1s_c_d_dd   1s_c_d_dd
-featparams             
-fillprob       1e-8        1.000000e-08
-frate          100     100
-fsg                    
-fsgusealtpron      yes     yes
-fsgusefiller       yes     yes
-fwdflat        yes     yes
-fwdflatbeam        1e-64       1.000000e-64
-fwdflatefwid       4       4
-fwdflatlw      8.5     8.500000e+00
-fwdflatsfwin       25      25
-fwdflatwbeam       7e-29       7.000000e-29
-fwdtree        yes     yes
-hmm                    /usr/local/share/pocketsphinx/model/en-us/en-us/
-input_endian       little      little
-jsgf                   
-keyphrase              abomination
-kws                    
-kws_delay      10      10
-kws_plp        1e-1        1.000000e-01
-kws_threshold      1       1.000000e-20
-latsize        5000        5000
-lda                    
-ldadim         0       0
-lifter         0       22
-lm                 
-lmctl                  
-lmname                 
-logbase        1.0001      1.000100e+00
-logfn                  
-logspec        no      no
-lowerf         133.33334   1.300000e+02
-lpbeam         1e-40       1.000000e-40
-lponlybeam     7e-29       7.000000e-29
-lw         6.5     6.500000e+00
-maxhmmpf       30000       30000
-maxwpf         -1      -1
-mdef                   
-mean                   
-mfclogdir              
-min_endfr      0       0
-mixw                   
-mixwfloor      0.0000001   1.000000e-07
-mllr                   
-mmap           yes     yes
-ncep           13      13
-nfft           512     512
-nfilt          40      25
-nwpen          1.0     1.000000e+00
-pbeam          1e-48       1.000000e-48
-pip            1.0     1.000000e+00
-pl_beam        1e-10       1.000000e-10
-pl_pbeam       1e-10       1.000000e-10
-pl_pip         1.0     1.000000e+00
-pl_weight      3.0     3.000000e+00
-pl_window      5       5
-rawlogdir              
-remove_dc      no      no
-remove_noise       yes     yes
-remove_silence     yes     yes
-round_filters      yes     yes
-samprate       16000       1.600000e+04
-seed           -1      -1
-sendump                
-senlogdir              
-senmgau                
-silprob        0.005       5.000000e-03
-smoothspec     no      no
-svspec                 0-12/13-25/26-38
-tmat                   
-tmatfloor      0.0001      1.000000e-04
-topn           4       4
-topn_beam      0       0
-toprule                
-transform      legacy      dct
-unit_area      yes     yes
-upperf         6855.4976   6.800000e+03
-uw         1.0     1.000000e+00
-vad_postspeech     50      50
-vad_prespeech      20      20
-vad_startspeech    10      10
-vad_threshold      2.0     2.000000e+00
-var                    
-varfloor       0.0001      1.000000e-04
-varnorm        no      no
-verbose        no      no
-warp_params                
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29       7.000000e-29
-wip            0.65        6.500000e-01
-wlen           0.025625    2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us//transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//means
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//variances
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us//sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(835): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138623 * 32 bytes (4331 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Allocated 1014 KiB for strings, 1677 KiB for phones
INFO: dict.c(336): 134522 words read
INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us//noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)
Recorder started
Decoder started

I found a reference explaining why you don't need the -lm line:

Also if you intend to use kws there is no need to use -lm in arguments. You need to remove:

"-lm", ".../model/hub4wsj_sc_8k_adapt/etc/hub4.5000.DMP",

That answers that.

If I change the above code and remove:

c.setString("-keyphrase", "abomination");

and add:

c.setString("-kws", "/home/pennyworth/keyphrase.list");

Now the output shows -kws set

-kws                    /home/pennyworth/keyphrase.list

and I get this in the output:

INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)

Still though, null result.

INFO: cmn_prior.c(99): cmn_prior_update: from < 73.10 11.10 -10.49  1.23  0.67 -1.37 -5.29  5.17 -0.62  3.91 -0.28  2.56 -2.14 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 73.59 11.36 -9.61  2.41  2.13  0.15 -5.09  4.30  1.03  4.46 -0.13  3.36 -0.62 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 73.59 11.36 -9.61  2.41  2.13  0.15 -5.09  4.30  1.03  4.46 -0.13  3.36 -0.62 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 74.65 10.58 -9.76  4.47  3.63  1.19 -5.20  3.74  2.33  4.75 -0.11  3.06 -0.32 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 74.65 10.58 -9.76  4.47  3.63  1.19 -5.20  3.74  2.33  4.75 -0.11  3.06 -0.32 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 77.49 10.99 -8.80  5.45  4.37  2.83 -4.14  4.06  3.48  5.07 -0.41  2.82 -0.35 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 77.49 10.99 -8.80  5.45  4.37  2.83 -4.14  4.06  3.48  5.07 -0.41  2.82 -0.35 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 73.54  9.62 -11.34  3.19  3.30  2.24 -6.61  4.52  1.31  5.99 -1.28  2.24 -0.96 >

Am I assuming wrong that kws doesn't return a hyp? There is a great Python example, but nothing for Java regarding kws.

https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py

Here is the API doc for pocketsphinx: http://cmusphinx.sourceforge.net/doc/pocketsphinx/pocketsphinx_8c_source.html

I don't know how to move forward with this. Either I'm not setting up the decoder correctly, or something else is happening and that's why I'm getting the null return.

I'm not clear on the use of -kws vs -keyphrase vs -kws_threshold. Does using -kws mean you don't need the other two since it effectively sets both the phrase and threshold?

Updated code. Added byteorder. ( And made sure my 1 wasn't an l )

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

import edu.cmu.pocketsphinx.Decoder;
import edu.cmu.pocketsphinx.Config;
import edu.cmu.pocketsphinx.Hypothesis;

public class Controller {
    static {
        System.loadLibrary("pocketsphinx_jni");
    }

    private static ByteArrayOutputStream out;

    public static void main(String args[]) {

        AudioFormat format = new AudioFormat(44100, 16, 1, true, true);
        TargetDataLine targetLine = null;
        DataLine.Info targetInfo = new DataLine.Info(TargetDataLine.class, format);
        boolean running = true;


        try {

            targetLine = AudioSystem.getTargetDataLine(format);
            targetLine.open();
            out = new ByteArrayOutputStream();
            int numBytesRead;
            byte[] data = new byte[targetLine.getBufferSize() / 5];


            Config c = Decoder.defaultConfig();
            c.setString("-hmm", "/usr/local/share/pocketsphinx/model/en-us/en-us/");
            c.setString("-dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict");
            c.setString("-keyphrase", "abomination");
            c.setFloat("-kws_threshold", 1e-20);
            //c.setString("-kws", "/home/bruce/keyphrase.list");


            Decoder d = new Decoder(c);
            d.setRawdataSize(300000);

            targetLine.start();
            System.out.println("Recorder started");

            byte[] b = new byte[4096];

            d.startUtt();

            System.out.println("Decoder started");

            while ((running)) {
                int nbytes;
                short[] s = null;
                nbytes = targetLine.read(b,0,b.length);

                ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
                s = new short[nbytes/2];

                bb.asShortBuffer().get(s);
                bb.order(ByteOrder.LITTLE_ENDIAN);
                d.processRaw(s, nbytes/2, false, false);


                if (nbytes > 0) {

                    Hypothesis hypothesis = d.hyp();
                    if (hypothesis != null) {
                        System.out.println("------------------------------------------------------");
                        System.out.println(hypothesis.getHypstr());
                        System.out.println("------------------------------------------------------");

                        d.endUtt();
                        d.startUtt();
                    }
                }
            }

        }
        catch (Exception e) {
            System.err.println(e);
        }
    }
}

Here is the updated output:

INFO: cmn_prior.c(99): cmn_prior_update: from < 67.54 12.23 -8.09 -0.29  0.56 -0.37 -3.25  7.00 -1.97  3.98 -1.87  3.63 -1.49 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 66.41 12.98 -8.67 -0.63  1.35 -0.13 -3.16  7.97 -3.11  3.57 -0.74  3.91 -1.79 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 66.41 12.98 -8.67 -0.63  1.35 -0.13 -3.16  7.97 -3.11  3.57 -0.74  3.91 -1.79 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 66.66 12.12 -10.32 -1.55  0.57 -0.01 -3.20  8.83 -2.47  4.65  0.07  4.54 -2.35 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 66.66 12.12 -10.32 -1.55  0.57 -0.01 -3.20  8.83 -2.47  4.65  0.07  4.54 -2.35 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 68.35 12.82 -9.91 -1.85  0.77  0.18 -2.25  9.05 -1.75  3.84  0.66  5.82 -2.50 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 68.35 12.82 -9.91 -1.85  0.77  0.18 -2.25  9.05 -1.75  3.84  0.66  5.82 -2.50 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 64.05 14.55 -8.14 -0.36  0.64  0.75 -1.96  9.76 -0.03  5.26  1.16  5.03 -1.68 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 64.05 14.55 -8.14 -0.36  0.64  0.75 -1.96  9.76 -0.03  5.26  1.16  5.03 -1.68 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 63.08 15.30 -8.96  0.76  1.05  0.83 -1.40 10.94 -0.69  4.52 -0.80  3.58 -3.18 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 63.08 15.30 -8.96  0.76  1.05  0.83 -1.40 10.94 -0.69  4.52 -0.80  3.58 -3.18 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 62.15 16.49 -10.32 -0.25  1.14 -0.32 -2.32 10.95 -2.12  2.91 -1.31  2.57 -4.05

Update:

Changed order of the following to reflect https://github.com/cmusphinx/pocketsphinx/blob/master/swig/java/test/DecoderTest.java

        bb.order(ByteOrder.LITTLE_ENDIAN);
        bb.asShortBuffer().get(s);

This produces no output at all once started, as if it gets no input.

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us//feat.params
Current configuration:
[NAME]          [DEFLT]     [VALUE]
-agc            none        none
-agcthresh      2.0     2.000000e+00
-allphone               
-allphone_ci        no      no
-alpha          0.97        9.700000e-01
-ascale         20.0        2.000000e+01
-aw         1       1
-backtrace      no      no
-beam           1e-48       1.000000e-48
-bestpath       yes     yes
-bestpathlw     9.5     9.500000e+00
-ceplen         13      13
-cmn            current     current
-cmninit        8.0     40,3,-1
-compallsen     no      no
-debug                  0
-dict                   /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase       no      no
-dither         no      no
-doublebw       no      no
-ds         1       1
-fdict                  
-feat           1s_c_d_dd   1s_c_d_dd
-featparams             
-fillprob       1e-8        1.000000e-08
-frate          100     100
-fsg                    
-fsgusealtpron      yes     yes
-fsgusefiller       yes     yes
-fwdflat        yes     yes
-fwdflatbeam        1e-64       1.000000e-64
-fwdflatefwid       4       4
-fwdflatlw      8.5     8.500000e+00
-fwdflatsfwin       25      25
-fwdflatwbeam       7e-29       7.000000e-29
-fwdtree        yes     yes
-hmm                    /usr/local/share/pocketsphinx/model/en-us/en-us/
-input_endian       little      little
-jsgf                   
-keyphrase              abomination
-kws                    
-kws_delay      10      10
-kws_plp        1e-1        1.000000e-01
-kws_threshold      1       1.000000e-20
-latsize        5000        5000
-lda                    
-ldadim         0       0
-lifter         0       22
-lm                 
-lmctl                  
-lmname                 
-logbase        1.0001      1.000100e+00
-logfn                  
-logspec        no      no
-lowerf         133.33334   1.300000e+02
-lpbeam         1e-40       1.000000e-40
-lponlybeam     7e-29       7.000000e-29
-lw         6.5     6.500000e+00
-maxhmmpf       30000       30000
-maxwpf         -1      -1
-mdef                   
-mean                   
-mfclogdir              
-min_endfr      0       0
-mixw                   
-mixwfloor      0.0000001   1.000000e-07
-mllr                   
-mmap           yes     yes
-ncep           13      13
-nfft           512     512
-nfilt          40      25
-nwpen          1.0     1.000000e+00
-pbeam          1e-48       1.000000e-48
-pip            1.0     1.000000e+00
-pl_beam        1e-10       1.000000e-10
-pl_pbeam       1e-10       1.000000e-10
-pl_pip         1.0     1.000000e+00
-pl_weight      3.0     3.000000e+00
-pl_window      5       5
-rawlogdir              
-remove_dc      no      no
-remove_noise       yes     yes
-remove_silence     yes     yes
-round_filters      yes     yes
-samprate       16000       1.600000e+04
-seed           -1      -1
-sendump                
-senlogdir              
-senmgau                
-silprob        0.005       5.000000e-03
-smoothspec     no      no
-svspec                 0-12/13-25/26-38
-tmat                   
-tmatfloor      0.0001      1.000000e-04
-topn           4       4
-topn_beam      0       0
-toprule                
-transform      legacy      dct
-unit_area      yes     yes
-upperf         6855.4976   6.800000e+03
-uw         1.0     1.000000e+00
-vad_postspeech     50      50
-vad_prespeech      20      20
-vad_startspeech    10      10
-vad_threshold      2.0     2.000000e+00
-var                    
-varfloor       0.0001      1.000000e-04
-varnorm        no      no
-verbose        no      no
-warp_params                
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29       7.000000e-29
-wip            0.65        6.500000e-01
-wlen           0.025625    2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us//transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//means
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//variances
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us//sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(835): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138623 * 32 bytes (4331 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Allocated 1014 KiB for strings, 1677 KiB for phones
INFO: dict.c(336): 134522 words read
INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us//noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
Recorder started
Decoder started
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)

Solution

  • abomination /le-20/
    

    Every developer should know the floating-point format: a number starts with a digit. 1e-1 is the same as 0.1, and 5e-3 is the same as 0.005. le-20, written with the letter l instead of the digit 1, is not a floating-point number.
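
    A quick way to catch this kind of typo is to run the threshold text through a number parser before putting it in a config or keyword list. The sketch below uses only the standard library; the class and method names are made up for illustration.

```java
public class ThresholdCheck {
    /** Returns true if the text parses as a floating-point number. */
    static boolean isNumber(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "1e-20" starts with the digit one and parses fine.
        System.out.println("1e-20 is a number: " + isNumber("1e-20"));
        // "le-20" starts with the lowercase letter l and is rejected.
        System.out.println("le-20 is a number: " + isNumber("le-20"));
        // Scientific notation and plain decimals denote the same values.
        System.out.println(Double.parseDouble("1e-1") == 0.1);
        System.out.println(Double.parseDouble("5e-3") == 0.005);
    }
}
```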

                d.processRaw(s, nbytes/2, false, false);
                d.setKws("abomination", );
    

    This code should not compile, because the second argument to setKws is missing. Also, you should not set the search in the middle of processing.

    I'm not clear on the use of -kws vs -keyphrase vs -kws_threshold. Does using -kws mean you don't need the other two since it effectively sets both the phrase and threshold?

    Yes, the keyword list given with -kws replaces the -keyphrase and -kws_threshold command-line options.
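
    To illustrate, each line of the keyword list carries both the phrase and its threshold, in the "phrase /threshold/" form shown in the question's keyphrase.list. The helper below is only a sketch: the class name, method name, and temp-file location are assumptions, and it writes the file without touching pocketsphinx itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

public class KeywordListFile {
    /** Formats one keyword-list line: the phrase, then the threshold between slashes. */
    static String entry(String phrase, String threshold) {
        // Reject typos such as "le-20" early; parseDouble throws NumberFormatException.
        Double.parseDouble(threshold);
        return phrase + " /" + threshold + "/";
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location; any readable path works.
        Path file = Files.createTempFile("keyphrase", ".list");
        Files.write(file, Collections.singletonList(entry("abomination", "1e-20")));
        System.out.println(Files.readAllLines(file));
        // The path would then be passed to the decoder config as in the question:
        // c.setString("-kws", file.toString());
    }
}
```

    With a file like this, the -keyphrase and -kws_threshold settings are left out of the config.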

    Am I assuming wrong that kws doesn't return a hyp?

    It does not return a hypothesis because you are not passing proper audio into the decoder. You need

    bb.order(ByteOrder.LITTLE_ENDIAN);

    before reading the samples. Without it, the buffer is decoded as big-endian data (the Java default). The SWIG example in the sources does this.
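
    The effect of the byte order is easy to see on a few hand-written bytes. This is a minimal sketch using only java.nio, assuming the raw bytes hold little-endian 16-bit samples; the class and method names are illustrative. Note that the order has to be set before asShortBuffer() is called, because the view takes the byte order the buffer has at the moment the view is created.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderDemo {
    /** Decodes 16-bit samples from raw bytes using the given byte order. */
    static short[] toShorts(byte[] raw, int nbytes, ByteOrder order) {
        ByteBuffer bb = ByteBuffer.wrap(raw, 0, nbytes);
        bb.order(order); // must happen before asShortBuffer()
        short[] samples = new short[nbytes / 2];
        bb.asShortBuffer().get(samples);
        return samples;
    }

    public static void main(String[] args) {
        // Two little-endian samples: 0x0001 and 0x0010.
        byte[] raw = {0x01, 0x00, 0x10, 0x00};
        short[] le = toShorts(raw, raw.length, ByteOrder.LITTLE_ENDIAN);
        short[] be = toShorts(raw, raw.length, ByteOrder.BIG_ENDIAN);
        System.out.println(le[0] + " " + le[1]); // prints "1 16"
        System.out.println(be[0] + " " + be[1]); // prints "256 4096"
    }
}
```

    Decoded with the wrong order, every sample is byte-swapped, so the decoder sees noise instead of speech.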

    INFO: cmn_prior.c(99): cmn_prior_update: from < 73.10 11.10 -10.49 1.23 0.67 -1.37 -5.29 5.17 -0.62 3.91 -0.28 2.56 -2.14 >

    A first CMN value around 70 indicates endianness problems. With the proper byte order, the first CMN value should be between 30 and 60.
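
    The configuration dump in the question lists -samprate 16000 and -input_endian little, while the capture line is opened at 44100 Hz with the big-endian flag set. As an assumption that goes beyond what is stated above, one way to keep the captured bytes consistent with bb.order(ByteOrder.LITTLE_ENDIAN) is to request a matching format from the line. Whether a given device supports this format is not guaranteed, and the class and method names here are illustrative.

```java
import javax.sound.sampled.AudioFormat;

public class CaptureFormat {
    /** 16 kHz, 16-bit, mono, signed, little-endian PCM, matching the logged decoder settings. */
    static AudioFormat decoderFormat() {
        float sampleRate = 16000f;  // matches -samprate 16000 in the config dump
        int sampleSizeInBits = 16;
        int channels = 1;
        boolean signed = true;
        boolean bigEndian = false;  // matches -input_endian little in the config dump
        return new AudioFormat(sampleRate, sampleSizeInBits, channels, signed, bigEndian);
    }

    public static void main(String[] args) {
        AudioFormat format = decoderFormat();
        System.out.println(format);
        // It would be used in place of the original format, e.g.
        // AudioSystem.getTargetDataLine(format)
    }
}
```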