Search code examples
javasocketssftpsshj

timeout at net.schmizz.sshj.transport.KeyExchanger.waitForDone(KeyExchanger.java:160)


I have a nightly job that is using net.schmizz.sshj SFTP library.

I suspect that there was a change on the server that we SFTP into. However, I cannot pinpoint what the exact change is. Nor can I reproduce it outside the production server. So I am looking for insight of a root cause or how to reproduce the following error

net.schmizz.sshj.transport.TransportException: Timeout expired at net.schmizz.sshj.transport.TransportException$1.chain(TransportException.java:33) at net.schmizz.sshj.transport.TransportException$1.chain(TransportException.java:27) at net.schmizz.concurrent.Promise.retrieve(Promise.java:139) at net.schmizz.concurrent.Event.await(Event.java:103) at net.schmizz.sshj.transport.KeyExchanger.waitForDone(KeyExchanger.java:160) at net.schmizz.sshj.transport.KeyExchanger.startKex(KeyExchanger.java:155) at net.schmizz.sshj.transport.TransportImpl.doKex(TransportImpl.java:247) at net.schmizz.sshj.SSHClient.doKex(SSHClient.java:735) at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:723) at net.schmizz.sshj.SocketClient.connect(SocketClient.java:54) at net.schmizz.sshj.SocketClient.connect(SocketClient.java:80)

From what I can tell, we're connecting via a java.net.Socket ok and then timing out waiting for the key exchange to complete in this code

package net.schmizz.sshj.transport;

/** Algorithm negotiation and key exchange. */
final class KeyExchanger
        implements SSHPacketHandler, ErrorNotifiable {

    /**
     * Starts key exchange by sending a {@code SSH_MSG_KEXINIT} packet. Key exchange needs to be done once mandatorily
     * after initializing the {@link Transport} for it to be usable and may be initiated at any later point e.g. if
     * {@link Transport#getConfig() algorithms} have changed and should be renegotiated.
     *
     * @param waitForDone whether should block till key exchange completed
     *
     * @throws TransportException if there is an error during key exchange
     * @see {@link Transport#setTimeoutMs} for setting timeout for kex
     */
    void startKex(boolean waitForDone)
            throws TransportException {
        if (!kexOngoing.getAndSet(true)) {
            done.clear();
            sendKexInit();
        }
        if (waitForDone)
            waitForDone();
    }

    /**
     * Sends SSH_MSG_KEXINIT and sets the {@link #kexInitSent} event.
     *
     * @throws TransportException
     */
    private void sendKexInit()
            throws TransportException {
        log.debug("Sending SSH_MSG_KEXINIT");
        clientProposal = new Proposal(transport.getConfig());
        transport.write(clientProposal.getPacket());
        kexInitSent.set();
    }

void waitForDone()
        throws TransportException {
    done.await(transport.getTimeoutMs(), TimeUnit.MILLISECONDS);
}

Solution

  • After much debugging after posting this question I figured out that the problem was that we don't have the same encryption libraries running in the production environments as I did in my local testing. I determined this by building the minimal jar that I could while still retaining all the SSHJ pieces used in production and all the relevant code of our own and then running it with DEBUG level logging in a command shell where I could see all output. Then I saw the following and I realized that the timeout exception in our production server logs is a second exception occurring after and masking the root cause which is not being logged in production: a NoClassDefFoundError

    Exception in thread "reader" java.lang.NoClassDefFoundError: org/bouncycastle/crypto/ec/CustomNamedCurves
        at net.schmizz.sshj.transport.kex.Curve25519DH.getCurve25519Params(Curve25519DH.java:51)
        at net.schmizz.sshj.transport.kex.Curve25519SHA256.initDH(Curve25519SHA256.java:36)
        at net.schmizz.sshj.transport.kex.AbstractDHG.init(AbstractDHG.java:55)
        at net.schmizz.sshj.transport.KeyExchanger.gotKexInit(KeyExchanger.java:239)
        at net.schmizz.sshj.transport.KeyExchanger.handle(KeyExchanger.java:359)
        at net.schmizz.sshj.transport.TransportImpl.handle(TransportImpl.java:493)
        at net.schmizz.sshj.transport.Decoder.decode(Decoder.java:107)
        at net.schmizz.sshj.transport.Decoder.received(Decoder.java:175)
        at net.schmizz.sshj.transport.Reader.run(Reader.java:60)
    Caused by: java.lang.ClassNotFoundException: org.bouncycastle.crypto.ec.CustomNamedCurves
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 9 more
    
    net.schmizz.sshj.transport.TransportException: Timeout expired
        at net.schmizz.sshj.transport.TransportException$1.chain(TransportException.java:33)
        at net.schmizz.sshj.transport.TransportException$1.chain(TransportException.java:27)
        at net.schmizz.concurrent.Promise.retrieve(Promise.java:139)
        at net.schmizz.concurrent.Event.await(Event.java:103)
        at net.schmizz.sshj.transport.KeyExchanger.waitForDone(KeyExchanger.java:160)
        at net.schmizz.sshj.transport.KeyExchanger.startKex(KeyExchanger.java:155)
        at net.schmizz.sshj.transport.TransportImpl.doKex(TransportImpl.java:247)
        at net.schmizz.sshj.SSHClient.doKex(SSHClient.java:735)
        at net.schmizz.sshj.SSHClient.onConnect(SSHClient.java:723)
        at net.schmizz.sshj.SocketClient.connect(SocketClient.java:54)
        at net.schmizz.sshj.SocketClient.connect(SocketClient.java:80)
        at kirby.App.connect(App.java:101)
        at kirby.App.execute(App.java:45)
        at kirby.App.main(App.java:31)
    Caused by: java.util.concurrent.TimeoutException: Timeout expired
        ... 12 more
    

    The difference comes down to classpath. In production we had

    bcpg-jdk15on-150.jar:bcpkix-jdk15on-150.jar:bcprov-jdk15on-150.jar

    but need

    bcpkix-jdk15on/1.51/bcpkix-jdk15on-1.51.jar:ecc-25519-java-1.0.1.jar:bcprov-jdk15on/1.51/bcprov-jdk15on-1.51.jar