google-cloud-platform gcloud google-cloud-speech sox

Setting up Google Speech-to-Text on Google Cloud - Error: spawn sox ENOENT


I've followed a bunch of tutorials on Google's Speech-to-Text, and have it all working fine locally. My setup uses websockets (socket.io) to communicate between a client Angular app and a Node/Express backend that makes the server-side call to the Speech API. I am using streaming-recognize (https://cloud.google.com/speech-to-text/docs/streaming-recognize) to read the mic stream and return results.

This works fully locally, but it fails on the instance deployed with gcloud app deploy, because I haven't installed the SoX dependency there (locally I did this via brew install sox). SoX is a requirement for their example of setting up the mic stream.

I think I need to set up a virtual machine instance which I can provision with SoX, but this seems a bit overkill - is there an alternative? I did try to manually parse and send the mic data stream as Uint8Array/ArrayBuffer chunks, with some success but not much. I also read some suggestions about non-SoX approaches to processing the user's mic stream, e.g. with recordrtc, but to no avail.

The question is - what do I need to do to get this working in gcloud? Set up a vm instance, install sox, and use that? Or is there a SoX-free way to get this running? Guidance welcomed!

Here's the server error I get on gcloud - I believe it occurs because SoX is not on the instance's PATH:

  Error: spawn sox ENOENT
      at Process.ChildProcess._handle.onexit (internal/child_process.js:267:19)
      at onErrorNT (internal/child_process.js:469:16)
      at processTicksAndRejections (internal/process/task_queues.js:84:21)
  Emitted 'error' event on ChildProcess instance at:
      at Process.ChildProcess._handle.onexit (internal/child_process.js:273:12)
      at onErrorNT (internal/child_process.js:469:16)
      at processTicksAndRejections (internal/process/task_queues.js:84:21) {
    errno: 'ENOENT',
    code: 'ENOENT',
    syscall: 'spawn sox',
    path: 'sox',
    spawnargs: [
      '--default-device',
      '--no-show-progress',
      '--rate',
      16000,
      '--channels',
      1,
      '--encoding',
      'signed-integer',
      '--bits',
      '16',
      '--type',
      'wav',
      '-'
    ]
  }
  npm ERR! code ELIFECYCLE
  npm ERR! errno 1
  npm ERR! <APPNAME>@0.0.0 start:prod: `node server.js;`
  npm ERR! Exit status 1
  npm ERR!
  npm ERR! Failed at the <APPNAME>@0.0.0 start:prod script.
  npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

  npm ERR! A complete log of this run can be found in:
  npm ERR!     /root/.npm/_logs/2020-11-30T22_12_35_041Z-debug.log
  npm ERR! code ELIFECYCLE
  npm ERR! errno 1
  npm ERR! <APPNAME>@0.0.0 start: `npm run start:prod`
  npm ERR! Exit status 1
  npm ERR!
  npm ERR! Failed at the <APPNAME>@0.0.0 start script.
  npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

  npm ERR! A complete log of this run can be found in:
  npm ERR!     /root/.npm/_logs/2020-11-30T22_12_35_074Z-debug.log

Solution

  • You are trying to set up Google Speech to Text and you want to deploy it on Google App Engine ( gcloud app deploy ).

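    However, the microphone streaming example depends on SoX, and this requires the SoX CLI to be installed in the operating system. The error shows the server trying to spawn the sox binary and not finding it.

    So you will need to use the App Engine flexible environment with a custom runtime (runtime: custom and env: flex in app.yaml), which lets you supply your own Dockerfile. In that Dockerfile, you can install the SoX CLI.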

    I was able to successfully deploy an App Engine Flex app that consumes Speech to Text API using the steps provided in the quickstart and the sample code from the nodejs-speech repository. Please have a look into it.
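
To confirm whether the SoX CLI is actually visible to the Node process, a startup check along these lines may help. This is only a minimal sketch: `isCommandAvailable` is a hypothetical helper name, not part of any library, and it assumes that a missing binary surfaces as an `ENOENT` error from `spawnSync`, which is the same failure shown in the question's log.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: returns false only when the command cannot be found
// (spawn fails with ENOENT), true otherwise.
function isCommandAvailable(command, args = ['--version']) {
  const result = spawnSync(command, args, { stdio: 'ignore' });
  // When the binary is missing, spawnSync reports it via result.error.
  return !(result.error && result.error.code === 'ENOENT');
}

// Log a readable message at startup instead of crashing later
// with "spawn sox ENOENT" when the first mic stream is opened.
if (!isCommandAvailable('sox')) {
  console.error('SoX CLI not found on PATH; install it in the container image.');
}
```

Running this at the top of server.js should make a missing SoX install obvious in the deployment logs.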

    ************** UPDATE **************

    Dockerfile:

    FROM gcr.io/google-appengine/nodejs
    
    # Working directory is where files are stored, npm is installed, and the application is launched
    WORKDIR /app
    
    # Install the SoX CLI first, in a single layer, so it stays cached across code and package changes.
    RUN apt-get update && apt-get install -y sox

    # Copy application to the /app directory.
    # Add only the package.json before running 'npm install' so 'npm install' is not run if there are only code changes, no package changes
    COPY package.json /app/package.json
    RUN npm install
    COPY . /app
    
    # Expose port so when the container is launched you can curl/see it.
    EXPOSE 8080
    
    # The command to execute when Docker image launches.
    CMD ["npm", "start"]
    

    Please try the above Dockerfile and adapt it to your needs; it is an example of how to install SoX.
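
On the SoX-free alternative raised in the question: one common approach is to capture audio in the browser and send raw PCM chunks over the socket, so the server never needs to spawn sox. The sketch below is not something I have deployed. It assumes the browser delivers Float32 samples in the range [-1, 1] (as the Web Audio API does), that the request is configured with the LINEAR16 encoding, and that sampleRateHertz matches the browser's actual sample rate. `floatTo16BitPCM` is a hypothetical helper name.

```javascript
// Hypothetical helper: converts Float32 samples in [-1, 1] to
// 16-bit signed little-endian PCM, the layout LINEAR16 expects.
function floatTo16BitPCM(float32Samples) {
  const buffer = new ArrayBuffer(float32Samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < float32Samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    // Scale negatives by 32768 and positives by 32767 to fill the int16 range.
    const value = s < 0 ? s * 0x8000 : s * 0x7fff;
    view.setInt16(i * 2, Math.round(value), true); // true = little-endian
  }
  return buffer;
}
```

Each resulting ArrayBuffer could then be emitted over socket.io and written to the streaming recognize request on the server in place of the SoX-backed mic stream.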