Robust skipping of data in a java.io.InputStream and its subtypes


I'm processing a binary stream and need to skip efficiently past a range of data that I'm not interested in, to some data that will be processed.

InputStream.skip(long) doesn't make much in the way of guarantees:

Skips over and discards n bytes of data from this input stream. The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. This may result from any of a number of conditions; reaching end of file before n bytes have been skipped is only one possibility. The actual number of bytes skipped is returned.

I need to know that one of two things has happened:

  1. The stream ended
  2. The bytes were skipped

Simple enough. However, the leniency afforded in this description means that, for example, BufferedInputStream can just skip a few bytes and return. Sure, it tells me that it's skipped just those few, but it's not clear why.

So my question is: can you use InputStream.skip(long) in such a way that you know whether the stream ended or the skip completed successfully?


Solution

  • I don't think we can get a really robust implementation, because the skip() method contract is rather bizarre. For one thing, the behaviour at EOF is not well defined. If I want to skip 8 bytes and is.skip(8) returns 0, it's not trivial to decide whether I should try again: there is a danger of an infinite loop if some implementation chooses to return 0 at EOF. And available() is not to be trusted, either.

    Hence, I propose the following:

    import java.io.IOException;
    import java.io.InputStream;

    /**
     * Skips n bytes. Best effort: stops early only if the stream ends.
     */
    public static void myskip(InputStream is, long n) throws IOException {
        while (n > 0) {
            long n1 = is.skip(n);
            if (n1 > 0) {
                n -= n1;
            } else if (n1 == 0) { // should we retry? let's read one byte to find out
                if (is.read() == -1) { // EOF
                    break;
                } else {
                    n--; // the byte we just read counts as skipped
                }
            } else { // negative? this should never happen, but...
                throw new IOException("skip() returned a negative value. This should never happen");
            }
        }
    }
    

    Shouldn't we return a value reporting the number of bytes "really skipped"? Or a boolean indicating that EOF was reached? We cannot do that robustly. For example, if we call skip(8) on a FileInputStream, it can return 8 even if we are already at EOF, or if the file has only 2 bytes left. But the method is robust in the sense that it does what we want: it skips n bytes (if possible) and lets me continue processing (if my next read returns -1, I'll know that EOF was reached).
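
If a trustworthy count matters more than speed, one alternative is to avoid skip() altogether and read the unwanted bytes into a scratch buffer. This is a different technique from the myskip method in the answer, not a refinement of it: read() reports EOF unambiguously with -1, so the returned count can be compared with n, but every skipped byte is actually read. The sketch below is a minimal illustration under those assumptions; the names ReadSkipDemo and discardFully and the 8192-byte buffer size are hypothetical choices, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadSkipDemo {

    /**
     * Reads and discards up to n bytes (hypothetical helper).
     * Returns the number of bytes actually consumed; a result
     * smaller than n means the stream ended first.
     */
    static long discardFully(InputStream in, long n) throws IOException {
        byte[] scratch = new byte[8192]; // arbitrary buffer size, an assumption
        long remaining = n;
        while (remaining > 0) {
            int want = (int) Math.min(scratch.length, remaining);
            int got = in.read(scratch, 0, want); // -1 signals EOF
            if (got == -1) {
                break;
            }
            remaining -= got;
        }
        return n - remaining;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        System.out.println(discardFully(in, 4));   // prints 4
        System.out.println(in.read());             // prints 4 (the byte at index 4)
        System.out.println(discardFully(in, 100)); // prints 5 (stream ended early)
        System.out.println(in.read());             // prints -1
    }
}
```

The trade-off is that nothing is skipped at the file-system level, so for large ranges on a FileInputStream this will likely be slower than skip(); it may still be a reasonable choice when the caller must distinguish "skipped everything" from "stream ended".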