I'm processing a binary stream and need to skip efficiently past a range of data that I'm not interested in, to some data that will be processed.
InputStream.skip(long) doesn't make much in the way of guarantees:
Skips over and discards n bytes of data from this input stream. The skip method may, for a variety of reasons, end up skipping over some smaller number of bytes, possibly 0. This may result from any of a number of conditions; reaching end of file before n bytes have been skipped is only one possibility. The actual number of bytes skipped is returned.
I need to know that one of two things has happened:
1. The stream ended
2. The bytes were skipped
Simple enough. However, the leniency afforded in this description means that, for example, BufferedInputStream can just skip a few bytes and return. Sure, it tells me that it's skipped just those few, but it's not clear why.
So my question is: can you make use of InputStream.skip(long) in such a way that you know when either the stream ends or the skip completes successfully?
I don't think we can get a really robust implementation because the skip() method contract is rather bizarre. For one thing, the behaviour at EOF is not well defined. If I want to skip 8 bytes and is.skip(8) returns 0, it's not trivial to decide whether I should try again: there is a danger of an infinite loop if some implementation chooses to return 0 at EOF. And available() is not to be trusted, either.
Hence, I propose the following:
    import java.io.IOException;
    import java.io.InputStream;

    /**
     * Skips n bytes. Best effort: stops early if EOF is reached.
     */
    public static void myskip(InputStream is, long n) throws IOException {
        while (n > 0) {
            long n1 = is.skip(n);
            if (n1 > 0) {
                n -= n1;
            } else if (n1 == 0) {
                // skip() made no progress. Should we retry? Read one byte to find out.
                if (is.read() == -1) {
                    break; // EOF
                }
                n--; // the byte we just read counts as skipped
            } else {
                // Negative? This should never happen, but...
                throw new IOException("skip() returned a negative value. This should never happen");
            }
        }
    }
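To see why the one-byte read matters, here is a minimal sketch that runs the same loop against a stream whose skip() always returns 0. The ZeroSkipStream class and the byteAfterSkip helper are made up for this illustration and are not part of the JDK. The loop still terminates, and the caller detects EOF from the next read():

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SkipDemo {
    /** Hypothetical worst-case stream: skip() never makes any progress. */
    static class ZeroSkipStream extends FilterInputStream {
        ZeroSkipStream(InputStream in) { super(in); }
        @Override public long skip(long n) { return 0; }
    }

    /** Same loop as myskip above, repeated so this example compiles on its own. */
    static void myskip(InputStream is, long n) throws IOException {
        while (n > 0) {
            long n1 = is.skip(n);
            if (n1 > 0) {
                n -= n1;
            } else if (n1 == 0) {
                if (is.read() == -1) {
                    break; // EOF: nothing left to skip
                }
                n--; // the byte we read counts as skipped
            } else {
                throw new IOException("skip() returned a negative value");
            }
        }
    }

    /** Hypothetical helper: skips n bytes, then returns the next byte, or -1 at EOF. */
    static int byteAfterSkip(byte[] data, long n) throws IOException {
        InputStream in = new ZeroSkipStream(new ByteArrayInputStream(data));
        myskip(in, n);
        return in.read();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {10, 20, 30, 40, 50};
        System.out.println(byteAfterSkip(data, 3)); // prints 40
        System.out.println(byteAfterSkip(data, 8)); // prints -1: the stream ended first
    }
}
```

Without the read() fallback, a stream like this one would spin forever in the while loop.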
Shouldn't we return a value to report the number of bytes "really skipped"? Or a boolean to signal that EOF was reached? We cannot do that in a robust way. For example, if we call skip(8) on a FileInputStream object, it will return 8 even if we are at EOF, or if the file has only 2 bytes. But the method is robust in the sense that it does what we want: skip n bytes (if possible) and let me continue processing the stream (if my next read returns -1, I'll know that EOF was reached).
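If you really do need an exact count, one alternative (a different technique from the skip()-based loop, and only a sketch) is to consume the bytes with read(byte[], int, int), which reports EOF unambiguously by returning -1. The skipByReading name and the 8192-byte scratch buffer are my own choices for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadSkip {
    /**
     * Hypothetical helper: discards up to n bytes by reading them.
     * Returns the number of bytes actually consumed, which is less
     * than n only if the stream ended first.
     */
    static long skipByReading(InputStream in, long n) throws IOException {
        byte[] scratch = new byte[8192]; // arbitrary buffer size
        long remaining = n;
        while (remaining > 0) {
            int want = (int) Math.min(scratch.length, remaining);
            int got = in.read(scratch, 0, want);
            if (got == -1) {
                break; // EOF before n bytes were consumed
            }
            remaining -= got;
        }
        return n - remaining;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        System.out.println(skipByReading(new ByteArrayInputStream(data), 3)); // prints 3
        System.out.println(skipByReading(new ByteArrayInputStream(data), 8)); // prints 5
    }
}
```

The price is efficiency: every skipped byte is actually read, so a stream that could otherwise seek (a FileInputStream, say) loses that advantage. On Java 12 or later, InputStream.skipNBytes(long) is also worth a look, since it throws EOFException if the stream ends before n bytes have been skipped.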