java spring zeromq distributed-computing

Is there any ZeroMqChannel channel timeout?


I have a ZeroMqChannel object that I use to retrieve data. Every week the service goes offline for maintenance, and the ZeroMqChannel instance keeps the connection open without throwing any exception.

Is there any way to throw a timeout error and try to reconnect later?

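    @Bean(name = "zeroMqChannel")
    public ZeroMqChannel zeroMqPubSubChannel(ZContext context) {
        // The second constructor argument 'true' selects PUB/SUB mode (the default is PUSH/PULL).
        ZeroMqChannel channel = new ZeroMqChannel(context, true);
        // 'url' is a field defined elsewhere in this configuration class.
        channel.setConnectUrl(url);
        return channel;
    }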

Solution

  • Having had my hands dirty with the ZeroMQ API since v2.1 ( 200?... - time flies so fast ),
    I dare claim
    that the problem lies in re-wrapping (with added design compromises) an already re-wrapped (with cruel(*) design omissions) native ZeroMQ API.

    The JeroMQ documentation ( as of 2022-Q1, based on libzmq 4.1.7 ) says:

    TCP KeepAlive Count, Idle, Interval cannot be set via Java but as OS level.


    WORKAROUND ?

    With the native API we can programmatically .bind() / .connect() another, temporary "link" between the two ZeroMQ AccessPoints that the code designer already treats as inter-linked (one of them really is persistent, the other gets switched off each week). The attempt shows whether the remote service AccessPoint is still alive or has gone down for the said maintenance without sending a proper signal to its counterpart.

    That is a rude and unfair practice among cooperating agents, isn't it? Fair and smart systems cooperate even on a planned re-configuration and notify their peer(s) before disconnecting. This is exactly why ZeroMQ messaging plus a signalling meta-plane makes distributed-computing designs robust and smart in production. I can only imagine how many unsalvageable distributed-FSA deadlocks such rude behaviour causes in REQ/REP cases.

    Details matter and depend on the re-wrapping of the re-wrapped native ZeroMQ API. There, one PUB/SUB link can operate as many "co-parallel" links, set up and torn down on the fly. A failed attempt to set up a new one then serves as a self-diagnosing tool: it tells us that the remote AccessNode cannot be connected to, and we turn this side effect to our advantage.
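
    The "failed attempt as a diagnostic" idea can be sketched with plain JDK sockets. This is only a rough illustration under stated assumptions: EndpointProbe and isReachable are hypothetical names, not part of JeroMQ or Spring Integration, and the probe works at the TCP level (it only tells whether something accepts connections on that host and port, not whether the ZeroMQ peer behind it is healthy).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper, not part of JeroMQ or Spring Integration. */
final class EndpointProbe {

    private EndpointProbe() {
    }

    /**
     * Opens a short-lived TCP connection to host:port and closes it again.
     * Returns true if the connection was accepted within timeoutMillis,
     * false if it was refused, unreachable, or timed out.
     */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket probe = new Socket()) {
            probe.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Any I/O failure is treated as "remote AccessPoint is down".
            return false;
        }
    }
}
```

    A scheduled task could call EndpointProbe.isReachable(host, port, 2000) with the host and port taken from the connect URL, and re-create the channel once the probe succeeds again after a failure. Whether that fits depends on how the wrappers let you tear down and rebuild the channel.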

    All this depends on the actual design of the wrappers. The native API even provides a socket monitor, which could solve part of these issues in a clear and sound manner, yet I am not sure whether it has been implemented in the JeroMQ and Spring Integration wrappers.
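
    If the socket monitor turns out not to be reachable through the wrappers, one possible fallback is an application-level idle watchdog. The sketch below is an assumption-laden illustration, not a feature of either library: IdleWatchdog, messageReceived and isStale are hypothetical names, and it only makes sense if the remote side publishes regularly (data or a heartbeat), so that silence longer than maxIdle can be read as "the peer is gone".

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: remembers when the last message arrived. */
final class IdleWatchdog {

    private final Duration maxIdle;
    // volatile: written by the receiving thread, read by the checking thread.
    private volatile Instant lastSeen;

    IdleWatchdog(Duration maxIdle, Instant start) {
        this.maxIdle = maxIdle;
        this.lastSeen = start;
    }

    /** Call this for every message received from the channel. */
    void messageReceived(Instant now) {
        lastSeen = now;
    }

    /** True if no message has arrived for longer than maxIdle. */
    boolean isStale(Instant now) {
        return Duration.between(lastSeen, now).compareTo(maxIdle) > 0;
    }
}
```

    Since ZeroMqChannel is a SubscribableChannel, a subscriber could call watchdog.messageReceived(Instant.now()) for each message, while a periodic task checks watchdog.isStale(Instant.now()) and raises the timeout error or triggers a reconnect attempt.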

    The documented re-wrapping of the ZeroMQ Zen-of-Zero into highly abstracted, yet internally super-inconsistent, "adaptor" details confuses me and warns me to be very careful about what else we lose from the smart original when using it:

    The ZeroMqChannel is a SubscribableChannel which uses a pair of ZeroMQ sockets to connect publishers and subscribers for messaging interaction. It can work in a PUB/SUB mode (defaults to PUSH/PULL); it can also be used as a local inter-thread channel (uses PAIR sockets)

    (*) The design omissions mentioned above include:

    • no ipc:// transport-class (emulated over tcp://loopback-port)
    • no pgm:// transport-class
    • no norm:// transport-class
    • no tipc:// transport-class
    • ...
    • PUB/SUB filtering based on TOPIC may, but need not, be re-configured by the sender into a multi-frame message ( i.e. without the knowledge of, or any control by, any of the potentially many SUB-side(s) )
    • "chameleon"-like typing of the (re-wrapped) smart ZeroMQ Scalable Formal Communication Pattern archetypes: the original PUB/SUB suddenly becomes PUSH/PULL, with all its "behavioural" patterns changed. I find it hard to consider this a positive for system design. Creeping properties IMHO bring no benefit: new users get lost, as they never see the smartness and power of the native API; experienced users panic, as the smart native-API features do not work as-is but got re-dressed into a mixture of originally super-clean, performance-optimised concepts; and hybrid { PUSH/PULL | PUB/SUB } mutable mezzo-types are error-prone when re-used in otherwise classical and stable use-case patterns, causing further conceptual misses and hidden design compromises. These are just a few of the losses caused by such a hybrid and liquid landscape.
    • and many more