Search code examples
google-cloud-platformwebsocketmqttdisconnectionaedes

Periodic client disconnections in MQTT NodeJS server on websockets when deployed to Google Cloud Run


I have a Node JS MQTT server running on websockets and deployed to Cloud Run. This server uses Aedes MQTT Broker library (https://github.com/moscajs/aedes). Clients connect to this server on port 443. Everything works ok, but periodically each client is disconnected from the MQTT Broker (aprox. every 5 minutes since each client first connected) This does not happen when the server is deployed to a dedicated machine or a Google Cloud VM, only happens when deployed to Google Cloud Run.

To illustrate the problem, I have written in NodeJS a super simple MQTT server, which allows any client to connect and echoes to queue "responses" any message that the client sends to queue "messages" and a NodeJS script to test it. You may find the full 2-files project in https://github.com/madomingo/mqtt_test/tree/main

The server code is:

const aedes = require('aedes')()
const ws = require('websocket-stream')
const port = 8882
const WEB_SOCKETS = true
let httpServer 

if (WEB_SOCKETS) {
    httpServer = require('http').createServer()
    ws.createServer({ server: httpServer }, aedes.handle)
} else {
    httpServer = require('net').createServer(aedes.handle)
}

httpServer.listen(port, function () {
    let type = (WEB_SOCKETS) ? "WebSockets" : "HTTP"
  console.log('STARTED: ' + type + ' server listening on port ', port)
})


aedes.subscribe('messages', function (packet, callback) {
    let msg = packet.payload.toString()
    console.log('Received message', msg);
    aedes.publish(
        {            
            topic: 'responses',
            payload: msg,
            retain: false
        }
    );

});

I have also a NodeJS test script that connects to the server and sends the datetime to a queue every 15 seconds:

let mqtt = require('mqtt');

// Web sockets host
const host = "ws://localhost:8882"

let clientId = 'test web client ' + Date.now
let client = mqtt.connect(host, {    
    clientId: clientId
  });

client.on('connect', function () {
  console.log("Connected to server")
  client.subscribe('responses');
  sendMessage()
  
  setInterval(() => {
      sendMessage()
  }, 15000);
 

})

client.on('message', function (topic, message) {
  console.log('Received a message in topic: ' + topic + ": " + message.toString());
});

client.on('error', function(e) {
  console.log('Received an error: ' + e.message)
})

client.on('disconnect', function() {
  console.log('Client was disconnected ' )
})
client.on('reconnect', function() {
  console.log('Client was reconnected ')
})

client.on('close', function() {
  console.log('Client was closed ' )
})

client.on('offline', function() {
  console.log('Client was offline ')
})

function sendMessage() {
  let now = new Date().toISOString()
  let msg = "Current time = " + now
  console.log("Publishing data", msg)
  client.publish('messages', msg);
}

If I run the server and the test script locally, everything works as expected: Every 15 secs', the client sends the date time to the server in the queue "messages", then the server immediately echoes this datetime to queue "responses". Both server and client log to the console. This runs indefinitely and without disconnections.

The problem I find is when the server is deployed to a Google Cloud Run service (I already did this and I got the url: test-mqtt-server-kdtisjwi5a-ew.a.run.app).

In the test script, I change the host url to: const host = "wss://test-mqtt-server-kdtisjwi5a-ew.a.run.app:443". Then run again the test script, the client connects to the server deployed to Cloud Run and, for a while, the console shows the logs as expected:

Connected to server
Publishing data Current time = 2021-09-16T16:06:26.937Z
Received a message in topic: responses: Current time = 2021-09-16T16:06:26.937Z
Publishing data Current time = 2021-09-16T16:06:41.944Z
Received a message in topic: responses: Current time = 2021-09-16T16:06:41.944Z

But every 5 minutes after the first connection, the client is unexpectedly disconnected, although it immediately reconnects. The log now shows:

Received a message in topic: responses: Current time = 2021-09-16T16:11:11.975Z

Client was offline
Client was closed

Client was reconnected
Connected to server
Publishing data Current time = 2021-09-16T16:11:28.121Z
Received a message in topic: responses: Current time = 2021-09-16T16:11:28.121Z

I did not expect that disconnection, since apparently there is not any connection problem. So my question is, why this disconnection occurs when deployed to Cloud Run? And can it be avoided?

Any help appreciated, thanks!!


Solution

  • As per the comments the docs say:

    WebSockets requests are treated as long-running HTTP requests in Cloud Run. They are subject to request timeouts (currently up to 60 minutes and defaults to 5 minutes) even if your application server does not enforce any timeouts.

    Accordingly, if the client keeps the connection open longer than the required timeout configured for the Cloud Run service, the client will be disconnected when the request times out.

    Therefore, WebSockets clients connecting to Cloud Run should handle reconnecting to the server if the request times out or the server disconnects. You can achieve this in browser-based clients by using libraries such as reconnecting-websocket or by handling "disconnect" events if you are using the SocketIO library.

    This will apply to MQTT over WebSockets so changing the timeout will allow longer connections but you should also ensure that the client is automatically reconnecting when the connection does go down.