I am trying to create a small Python script that receives an audio stream over the network, feeds it through pocketsphinx to convert speech to text, and runs some commands depending on the pocketsphinx output.
I've installed sphinxbase and pocketsphinx (5prealpha) on an Ubuntu 15.10 VM and can properly process the content of an example audio file (part of the pocketsphinx installation) in Python, so I'm reasonably sure my sphinx install works. Unfortunately that test Python script uses the native pocketsphinx API and cannot process continuous audio. According to the cmusphinx website I should use gstreamer for continuous recognition, but the information on how to use pocketsphinx with gstreamer in Python is rather limited. Based on the examples I could find, I pieced together the following script.
import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst
GObject.threads_init()
Gst.init(None)
def element_message(bus, msg):
    # Only handle element messages posted by the pocketsphinx element
    msgtype = msg.get_structure().get_name()
    if msgtype != 'pocketsphinx':
        return
    print("hypothesis= '%s' confidence=%s\n" % (msg.get_structure().get_value('hypothesis'), msg.get_structure().get_value('confidence')))
pipeline = Gst.parse_launch('udpsrc port=3000 name=src caps=application/x-rtp ! rtppcmadepay name=rtpp ! alawdec name=decoder ! queue ! pocketsphinx name=asr ! fakesink')
asr = pipeline.get_by_name("asr")
asr.set_property("configured", "true")
bus = pipeline.get_bus()
bus.add_signal_watch()
bus.connect('message::element', element_message)
pipeline.set_state(Gst.State.PLAYING)
# enter into a mainloop
loop = GObject.MainLoop()
loop.run()
The sending side looks like:
import gobject, pygst
pygst.require("0.10")
import gst
pipeline = gst.parse_launch('alsasrc ! audioconvert ! audioresample ! alawenc ! rtppcmapay ! udpsink port=3000 host=192.168.13.120')
pipeline.set_state(gst.STATE_PLAYING)
loop = gobject.MainLoop()
loop.run()
This should receive a UDP stream from the network, feed it into pocketsphinx and print the output to the terminal. If I replace the 'queue ! pocketsphinx ! fakesink' part with 'wavenc ! filesink', I do get a valid audio file with the correct content, so I know the network-sending part is working correctly. (I do not have audio on my test machine, so I cannot test with a local audio source.)
When I start the script I see the pocketsphinx configuration scroll by, but after that the script doesn't seem to do anything at all. When I start the script with GST_DEBUG=*:4 I see the following output:
0:00:04.789157687 2220 0x86fff70 INFO GST_EVENT gstevent.c:760:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:00.000000000, duration 99:99:99.999999999
0:00:04.789616981 2220 0x86fff70 INFO basesrc gstbasesrc.c:2838:gst_base_src_loop:<src> marking pending DISCONT
0:00:04.789995780 2220 0x86fff70 INFO GST_EVENT gstevent.c:760:gst_event_new_segment: creating segment event time segment start=0:00:00.000000000, offset=0:00:00.000000000, stop=99:99:99.999999999, rate=1.000000, applied_rate=1.000000, flags=0x00, time=0:00:00.000000000, base=0:00:00.000000000, position 0:00:04.079311489, duration 99:99:99.999999999
0:00:04.790420834 2220 0x86fff70 INFO GST_EVENT gstevent.c:679:gst_event_new_caps: creating caps event audio/x-raw, format=(string)S16LE, layout=(string)interleaved, rate=(int)8000, channels=(int)1
0:00:04.790851965 2220 0x86fff70 WARN GST_PADS gstpad.c:3989:gst_pad_peer_query:<decoder:src> could not send sticky events
0:00:04.791258320 2220 0x86fff70 WARN basesrc gstbasesrc.c:2943:gst_base_src_loop:<src> error: Internal data flow error.
0:00:04.791572605 2220 0x86fff70 WARN basesrc gstbasesrc.c:2943:gst_base_src_loop:<src> error: streaming task paused, reason not-negotiated (-4)
0:00:04.791917073 2220 0x86fff70 INFO GST_ERROR_SYSTEM gstelement.c:1837:gst_element_message_full:<src> posting message: Internal data flow error.
0:00:04.792305347 2220 0x86fff70 INFO GST_ERROR_SYSTEM gstelement.c:1860:gst_element_message_full:<src> posted error message: Internal data flow error.
0:00:04.792633841 2220 0x86fff70 INFO task gsttask.c:315:gst_task_func:<src:src> Task going to paused
I do not understand what is going wrong based on the information and the examples I found Googling.
Any help would be highly appreciated.
Nico
The GStreamer pocketsphinx element requires 16000 Hz audio, but you are passing 8000 Hz. To use 8000 Hz you would have to modify the pocketsphinx sources to enable that rate in the pocketsphinx element: update the rate in the element spec, the samprate configuration parameter of pocketsphinx, and the acoustic model.
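The mismatch is visible in the debug log from the question: the caps event created after the decoder reports rate=(int)8000. As a rough illustration, the sketch below pulls the rate out of such a caps string and compares it with the 16000 Hz the element expects. The helper name caps_rate and the constant REQUIRED_RATE are made up for this example; they are not part of GStreamer or pocketsphinx.

```python
import re

REQUIRED_RATE = 16000  # Hz, the rate the pocketsphinx element expects


def caps_rate(caps):
    """Return the integer sample rate found in a caps string, or None.

    Hypothetical helper for illustration only. The lookbehind avoids
    matching fields such as "clock-rate".
    """
    match = re.search(r"(?<![\w-])rate=\(int\)(\d+)", caps)
    return int(match.group(1)) if match else None


# Caps string copied from the gst_event_new_caps line in the debug log.
log_caps = ("audio/x-raw, format=(string)S16LE, layout=(string)interleaved, "
            "rate=(int)8000, channels=(int)1")

rate = caps_rate(log_caps)
print("decoder rate: %s Hz, required: %d Hz, match: %s"
      % (rate, REQUIRED_RATE, rate == REQUIRED_RATE))
```

With the caps from the log this reports a decoder rate of 8000 Hz against a required 16000 Hz, which is consistent with the not-negotiated error.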
Alternatively, send wideband audio over the network. In that case you should not use the A-law codec.
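A minimal sketch of what the wideband variant could look like, assuming uncompressed 16-bit PCM over RTP (the rtpL16pay / rtpL16depay elements) at 16000 Hz mono in place of A-law. The helper functions only build launch strings in GStreamer 1.0 syntax; the function names, the payload type 96 and the audioconvert ! audioresample stage in front of pocketsphinx are assumptions for illustration and have not been tested against this setup. The 0.10 sender in the question would need the corresponding 0.10 caps syntax.

```python
# Sketch only: 16 kHz mono L16 (uncompressed PCM) over RTP instead of A-law.
# Host, port and payload type below are illustrative assumptions.

SAMPLE_RATE = 16000  # Hz, the rate the pocketsphinx element expects


def sender_pipeline(host, port=3000, rate=SAMPLE_RATE):
    """Build a GStreamer 1.0 launch string for the sending side."""
    return (
        "alsasrc ! audioconvert ! audioresample ! "
        "audio/x-raw,rate=%d,channels=1 ! "
        "rtpL16pay ! udpsink host=%s port=%d" % (rate, host, port)
    )


def receiver_pipeline(port=3000, rate=SAMPLE_RATE):
    """Build a GStreamer 1.0 launch string for the receiving side."""
    # udpsrc cannot guess the RTP payload format, so the caps spell it out.
    caps = (
        "application/x-rtp,media=(string)audio,clock-rate=(int)%d,"
        "encoding-name=(string)L16,channels=(int)1,payload=(int)96" % rate
    )
    return (
        'udpsrc port=%d caps="%s" ! rtpL16depay ! '
        "audioconvert ! audioresample ! "
        "pocketsphinx name=asr ! fakesink" % (port, caps)
    )


if __name__ == "__main__":
    print(sender_pipeline("192.168.13.120"))
    print(receiver_pipeline())
```

The resulting strings can be passed to Gst.parse_launch() in place of the A-law pipelines from the question. Uncompressed 16 kHz audio uses noticeably more bandwidth than 8 kHz A-law, so a wideband codec may be preferable on a constrained link.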