Jobs finishing successfully even though IOException occurs

I receive various IOExceptions on my master node when running GridMix. Should I be really concerned about them, or are they transient, given that my jobs are finishing successfully?

IOException: Bad connect ack with firstBadLink: \ Bad response ERROR for block BP-49483579- from datanode
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$


  • I cannot be sure without knowing your complete setup, but most likely these exceptions occur while the client is setting up a pipeline for append, that is, when stage == BlockConstructionStage.PIPELINE_SETUP_APPEND in the code.

    In any case, since your jobs finish successfully you need not worry. They succeed because, when an exception occurs while opening a DataOutputStream to a DataNode pipeline, the client keeps retrying until a pipeline is set up.

    The exception is thrown in org.apache.hadoop.hdfs.DFSOutputStream; the relevant code excerpts are below.

     private boolean createBlockOutputStream(DatanodeInfo[] nodes, long newGS, boolean recoveryFlag) {
        // ... (excerpt: code that connects and reads the ack is omitted)
        if (pipelineStatus != SUCCESS) {
          if (pipelineStatus == Status.ERROR_ACCESS_TOKEN) {
            throw new InvalidBlockTokenException(
                "Got access token error for connect ack with firstBadLink as "
                    + firstBadLink);
          } else {
            throw new IOException("Bad connect ack with firstBadLink as "
                + firstBadLink);
          }
        }
        // ...
     }

    createBlockOutputStream is called from setupPipelineForAppendOrRecovery, and as that method's code comment says: "It keeps on trying until a pipeline is setup". Note that createBlockOutputStream returns a boolean, so the caller can see that an attempt failed and try again.

     /**
      * Open a DataOutputStream to a DataNode pipeline so that
      * it can be written to.
      * This happens when a file is appended or data streaming fails
      * It keeps on trying until a pipeline is setup
      */
     private boolean setupPipelineForAppendOrRecovery() throws IOException {
         // ... (excerpt)
         while (!success && !streamerClosed && dfsClient.clientRunning) {
             // ...
             success = createBlockOutputStream(nodes, newGS, isRecovery);
         }
         // ...
     }

    If you read through the complete org.apache.hadoop.hdfs.DFSOutputStream code, you will see that pipeline setup is retried until a pipeline is created, whether for append or for fresh use.
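
    The retry behaviour described above can be sketched in isolation. This is only an illustration of the control flow, not Hadoop's code: PipelineRetrySketch, PipelineConnector and setupPipeline are hypothetical names, and the maxAttempts cap is an assumption added so the sketch always terminates.

    ```java
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // Simplified, hypothetical model of "keep trying until a pipeline is set up".
    // None of these names come from Hadoop; they only illustrate the idea.
    class PipelineRetrySketch {

        /** Hypothetical stand-in for one attempt to connect to a DataNode pipeline. */
        interface PipelineConnector {
            void connect(int attempt) throws IOException;
        }

        /**
         * Retries until one attempt succeeds or maxAttempts is reached.
         * Failed attempts are only logged, which is why a job can still finish
         * successfully even though exceptions show up in the logs.
         * Returns the number of attempts that were needed.
         */
        static int setupPipeline(PipelineConnector connector, int maxAttempts,
                                 List<String> log) throws IOException {
            IOException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    connector.connect(attempt);
                    return attempt; // pipeline is up, stop retrying
                } catch (IOException e) {
                    last = e;
                    log.add("attempt " + attempt + " failed: " + e.getMessage());
                }
            }
            throw new IOException("pipeline setup failed after " + maxAttempts + " attempts", last);
        }

        public static void main(String[] args) throws IOException {
            List<String> log = new ArrayList<>();
            // Simulated connector: the first two attempts hit a bad node, the third succeeds.
            int used = setupPipeline(attempt -> {
                if (attempt < 3) {
                    throw new IOException("Bad connect ack with firstBadLink as dn-" + attempt);
                }
            }, 5, log);
            System.out.println("pipeline ready after " + used + " attempts");
            log.forEach(System.out::println);
        }
    }
    ```

    In this sketch the two failed attempts end up in the log while the call as a whole still succeeds, which mirrors what you are seeing on the master node.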

    If you want to address it, you can try adjusting the dfs.datanode.max.xcievers property in hdfs-site.xml; many people have reported that this resolved the problem. Note that you need to restart your Hadoop services after setting the property.
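
    A minimal sketch of what that hdfs-site.xml entry might look like. The value 4096 is an assumed example, not a figure from your cluster, so size it for your own workload. In newer Hadoop releases the property is named dfs.datanode.max.transfer.threads and the old spelling is deprecated, so check which name your version expects.

    ```xml
    <!-- hdfs-site.xml on each DataNode; 4096 is an assumed example value -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>
    ```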
