
Azure edge hub message expiry not triggering callback


I have a scenario where I need to display a message if a module is unresponsive for more than a few seconds.

To do this, I call sendEventAsync with a message that has an expiry time of 2000 ms:

module.sendMessageToTopic(ConnectionStatusRequest.newBuilder().withExpiryTime(2000).build(), (responseStatus, callbackContext) -> {
    if (IotHubStatusCode.MESSAGE_EXPIRED.equals(responseStatus)) {
        LOGGER.warn("Could not retrieve connection status before TTL, considering module as offline.");
        broadcastConnectionStatus(IotHubConnectionStatus.DISCONNECTED);
    } else {
        LOGGER.info("Received status callback for connection status: {}", responseStatus);
    }
});

The edge deployment schema has a TTL of 90 seconds for all routes except upstream, and I want this specific message to live for only 2 seconds. If I have not received a response on this topic within 2 seconds, I consider the message expired and the module in question offline. I've also tried setting a route-specific TTL of 2 seconds with no expiry time set on the message, but I get the same result.

However, when I force the issue by manually killing the container that listens on this topic, and it takes more than 2 seconds to start again, I never get a callback with status MESSAGE_EXPIRED:

2021-07-19 09:25:42,683 [WebSocketWorker-17] INFO  s.i.l.w.service.WebSocketService.broadcastConnectionStatus(470) - Module is online, requesting additional status information from other modules.
2021-07-19 09:25:42,740 [azure-iot-sdk-IotHubSendTask] INFO  s.i.l.w.service.WebSocketService.lambda$broadcastConnectionStatus$32(476) - Received status callback for connection status: OK_EMPTY

When I dig through the source code of the module logic for Java, it seems like it should poll for messages to send and receive every 10ms. It then invokes whatever callback was provided with that message. A message is considered expired when System.currentTimeMillis() is greater than the provided expiry time. That should set the packet status to MESSAGE_EXPIRED and add it to a callback queue that is invoked on an executor schedule.
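The mechanism described above can be modeled in a few lines. This is a simplified sketch, not the SDK's source: ExpirySketch, Packet, and sendPending are hypothetical names, and the clock is passed in as a plain number (instead of calling System.currentTimeMillis()) so the run is deterministic. It assumes the expiry check happens only while a packet is still waiting in the sender's queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of a client-side send queue; not the Azure IoT SDK classes.
final class ExpirySketch {
    enum Status { OK_EMPTY, MESSAGE_EXPIRED }

    static final class Packet {
        final String body;
        final long deadlineMillis; // absolute time after which the packet counts as expired

        Packet(String body, long nowMillis, long ttlMillis) {
            this.body = body;
            this.deadlineMillis = nowMillis + ttlMillis;
        }

        boolean isExpired(long nowMillis) {
            return nowMillis > deadlineMillis;
        }
    }

    // One pass of the send task: every waiting packet is either reported as
    // expired or handed to the transport (modeled here as an OK_EMPTY status).
    static List<Status> sendPending(Queue<Packet> waiting, long nowMillis) {
        List<Status> statuses = new ArrayList<>();
        Packet packet;
        while ((packet = waiting.poll()) != null) {
            statuses.add(packet.isExpired(nowMillis) ? Status.MESSAGE_EXPIRED : Status.OK_EMPTY);
        }
        return statuses;
    }

    public static void main(String[] args) {
        Queue<Packet> waiting = new ArrayDeque<>();
        long enqueuedAt = 1_000L;
        waiting.add(new Packet("connection status request", enqueuedAt, 2_000L));

        // The send task runs 10 ms after enqueue, long before the 2 s deadline.
        System.out.println(sendPending(waiting, enqueuedAt + 10));    // [OK_EMPTY]

        // A later pass has nothing left in the queue to expire.
        System.out.println(sendPending(waiting, enqueuedAt + 5_000)); // []
    }
}
```

In this model a packet with a 2000 ms expiry leaves the queue on the first pass, so no later pass ever sees it again. A packet with an expiry of 0 is already past its deadline 1 ms after it is enqueued.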

So what I would expect is:

  1. OK_EMPTY when Edge Hub has received the message
  2. MESSAGE_EXPIRED on the same callback when Edge Hub fails to get an ack from the topic's consumer within 2 seconds

Am I misunderstanding how these callbacks work?

How does the programmatic setter for expiry time on the Message object relate to the TTL of the edge hub routes?

Where can I listen to the MESSAGE_EXPIRED event to achieve my desired behavior?

Update:

I tried turning the expiry time down to 0, and that immediately triggers a MESSAGE_EXPIRED callback, even when the module I sent to is online. I've also tried reducing the time from 500 ms to 100 ms in 100 ms steps, but it doesn't help; I still don't get the MESSAGE_EXPIRED callback.


Solution

  • When you use the setExpiryTime method on a message that is sent to the edge hub, that expiry only applies if the sender fails to deliver the message to the edge hub.

    Once the message is delivered to the edge hub, the SDK's job is done; from the sender's point of view it is "fire and forget".

    Once the message is on the edge hub, you can use timeToLiveSecs in the routes section. However, that doesn't cause any callback either; it simply tells the edge hub to discard messages that fail to be delivered within the time limit.
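Since neither mechanism produces a callback, one possible workaround is to time the reply in the application itself. The following is a minimal sketch under assumptions, not part of the SDK: ReplyWatchdog, ConnectionState, and awaitReply are hypothetical names, and it assumes the target module sends some reply that your own message handler can use to complete a CompletableFuture. It needs Java 9 or later for completeOnTimeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical application-level watchdog; not an Azure IoT SDK API.
final class ReplyWatchdog {
    enum ConnectionState { CONNECTED, DISCONNECTED }

    // Resolves to CONNECTED if the reply arrives in time, otherwise to
    // DISCONNECTED once timeoutMillis has elapsed.
    static CompletableFuture<ConnectionState> awaitReply(CompletableFuture<?> reply, long timeoutMillis) {
        return reply
                .thenApply(ignored -> ConnectionState.CONNECTED)
                .completeOnTimeout(ConnectionState.DISCONNECTED, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        // Case 1: the module answers before the deadline.
        CompletableFuture<String> answered = new CompletableFuture<>();
        CompletableFuture<ConnectionState> online = awaitReply(answered, 2_000);
        answered.complete("status: ok"); // would be called from the reply handler
        System.out.println(online.join()); // CONNECTED

        // Case 2: the module never answers, so the timeout decides.
        CompletableFuture<String> silent = new CompletableFuture<>();
        System.out.println(awaitReply(silent, 100).join()); // DISCONNECTED
    }
}
```

In the scenario from the question, the request would be sent as before, the handler for the module's reply would complete the future, and broadcastConnectionStatus would be driven by the resulting state instead of by the send callback.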