
What does BUF_OFFSET field mean in BufferedInputStream?


This is the field.

private static final long BUF_OFFSET
        = U.objectFieldOffset(BufferedInputStream.class, "buf");

This is the code that uses BUF_OFFSET.

private void fill() throws IOException {
    byte[] buffer = getBufIfOpen();
    if (markpos < 0)
        pos = 0;            /* no mark: throw away the buffer */
    else if (pos >= buffer.length) { /* no room left in buffer */
        if (markpos > 0) {  /* can throw away early part of the buffer */
            int sz = pos - markpos;
            System.arraycopy(buffer, markpos, buffer, 0, sz);
            pos = sz;
            markpos = 0;
        } else if (buffer.length >= marklimit) {
            markpos = -1;   /* buffer got too big, invalidate mark */
            pos = 0;        /* drop buffer contents */
        } else {            /* grow buffer */
            int nsz = ArraysSupport.newLength(pos,
                    1,  /* minimum growth */
                    pos /* preferred growth */);
            if (nsz > marklimit)
                nsz = marklimit;
            byte[] nbuf = new byte[nsz];
            System.arraycopy(buffer, 0, nbuf, 0, pos);
            if (!U.compareAndSetReference(this, BUF_OFFSET, buffer, nbuf)) {
                // Can't replace buf if there was an async close.
                // Note: This would need to be changed if fill()
                // is ever made accessible to multiple threads.
                // But for now, the only way CAS can fail is via close.
                // assert buf == null;
                throw new IOException("Stream closed");
            }
            buffer = nbuf;
        }
    }
    count = pos;
    int n = getInIfOpen().read(buffer, pos, buffer.length - pos);
    if (n > 0)
        count = n + pos;
}

This is the code in the JDK source that computes the value returned for BUF_OFFSET.

static jlong find_field_offset(jclass clazz, jstring name, TRAPS) {
  assert(clazz != NULL, "clazz must not be NULL");
  assert(name != NULL, "name must not be NULL");

  ResourceMark rm(THREAD);
  char *utf_name = java_lang_String::as_utf8_string(JNIHandles::resolve_non_null(name));

  InstanceKlass* k = InstanceKlass::cast(java_lang_Class::as_Klass(JNIHandles::resolve_non_null(clazz)));

  jint offset = -1;
  for (JavaFieldStream fs(k); !fs.done(); fs.next()) {
    Symbol *name = fs.name();
    if (name->equals(utf_name)) {
      offset = fs.offset();
      break;
    }
  }
  if (offset < 0) {
    THROW_0(vmSymbols::java_lang_InternalError());
  }
  return field_offset_from_byte_offset(offset);
}

What does BUF_OFFSET field mean in BufferedInputStream?

I looked up BUF_OFFSET in the JDK source code on GitHub (https://github.com/openjdk/jdk/tree/jdk-17%2B35). I have already asked a question here about what JavaFieldStream is, but I am still confused about BUF_OFFSET.

My guess: maybe BUF_OFFSET is similar to the off parameter of FileInputStream's read method?

public int read(byte b[], int off, int len) throws IOException {
    return readBytes(b, off, len);
}

Here, off is the position in the Java array at which the data copied from the C array starts being filled in. So does BUF_OFFSET mean where the data is filled in inside the JVM? That's just my guess.


Solution

  • The fill() method refills the internal buf array with more data, replacing it with an array of a different size when necessary. Calls to fill() are serialized: in JDK 17 the public methods that reach it are synchronized (newer JDK releases use an internal lock, or synchronized when BufferedInputStream is subclassed). When the stream is closed, buf is set to null to mark it as closed. However, close() can run asynchronously: it is not protected against concurrent access the way calls to fill() are. fill() therefore has to check whether buf has become null (the stream was closed) before installing a new array, and close() has to perform the equivalent of if (buf == buffer) buf = null; (clear buf only if fill() has not replaced it concurrently). Both steps must be atomic to be thread-safe.

    This is done with atomic CAS (compare-and-swap / compare-and-set) instructions, which compare the content of a memory location with a given value and, only if they are the same, modify the content of that location to a given new value, all as a single atomic operation. Atomicity guarantees that the write happens only if the location still holds the expected value (i.e., it has not been modified by another thread in the meantime).
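
    As a minimal sketch of these semantics, the following uses the public AtomicReference class rather than the JDK-internal Unsafe API. The class name CasDemo and its demo() method are hypothetical and exist only for illustration. Note that the comparison is by reference identity, not by equals().

    ```java
    import java.util.concurrent.atomic.AtomicReference;

    // Hypothetical demo class; not part of the JDK.
    class CasDemo {
        static String demo() {
            byte[] oldBuf = new byte[8];
            byte[] newBuf = new byte[16];
            AtomicReference<byte[]> ref = new AtomicReference<>(oldBuf);

            // Succeeds: the location still holds the very array we expect.
            boolean first = ref.compareAndSet(oldBuf, newBuf);

            // Fails: the location now holds newBuf, so the expectation is stale
            // and the value is left untouched.
            boolean second = ref.compareAndSet(oldBuf, new byte[32]);

            return first + " " + second + " " + ref.get().length;
        }

        public static void main(String[] args) {
            System.out.println(demo()); // prints "true false 16"
        }
    }
    ```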

    There are several ways to do this in Java: the AtomicXyzFieldUpdater classes in the java.util.concurrent.atomic package, the java.lang.invoke VarHandle API introduced in Java 9, and a JDK-internal Unsafe API not intended for public use.
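
    To make the two public options concrete, here is a hedged sketch that applies both to the same field. The Holder class is hypothetical; only AtomicReferenceFieldUpdater, MethodHandles and VarHandle are real JDK APIs. In both cases the handle is resolved once and kept in a static final, much like an Unsafe field offset.

    ```java
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.VarHandle;
    import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

    // Hypothetical class with a buf field, loosely mirroring the shape of the problem.
    class Holder {
        // AtomicReferenceFieldUpdater requires the target field to be volatile.
        volatile byte[] buf = new byte[8];

        // Option 1: field updater, resolved once by field name.
        static final AtomicReferenceFieldUpdater<Holder, byte[]> UPDATER =
                AtomicReferenceFieldUpdater.newUpdater(Holder.class, byte[].class, "buf");

        // Option 2: VarHandle (Java 9+), also resolved once by field name.
        static final VarHandle BUF;
        static {
            try {
                BUF = MethodHandles.lookup().findVarHandle(Holder.class, "buf", byte[].class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        static String demo() {
            Holder h = new Holder();
            byte[] original = h.buf;

            // Succeeds: buf still refers to 'original'.
            boolean viaUpdater = UPDATER.compareAndSet(h, original, new byte[16]);

            // Fails: 'original' is stale after the first swap.
            boolean viaVarHandle = BUF.compareAndSet(h, original, new byte[32]);

            return viaUpdater + " " + viaVarHandle + " " + h.buf.length;
        }

        public static void main(String[] args) {
            System.out.println(demo()); // prints "true false 16"
        }
    }
    ```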

    Since BufferedInputStream is used early in the JVM bootstrap (the initialization phase at JVM startup), it is important a) to avoid using APIs that may not have been initialized at that time (which can lead to dependency cycles) and b) to avoid slowing down the bootstrap by using APIs that have to be initialized before they can be used. That's the rationale for using the internal Unsafe API in this case.

    The Unsafe method we are talking about is the following:

    /**
     * Atomically updates Java variable to {@code x} if it is currently
     * holding {@code expected}.
     *
     * <p>This operation has memory semantics of a {@code volatile} read
     * and write.  Corresponds to C11 atomic_compare_exchange_strong.
     *
     * @return {@code true} if successful
     */
    @IntrinsicCandidate
    public final native boolean compareAndSetReference(Object o, long offset,
                                                       Object expected,
                                                       Object x);
    

    The first argument is the object instance whose field should be set, the second argument is the offset of the field to be updated in that object (in effect a relative memory address), the third argument is the value that we expect to be currently in that field and the fourth argument is the value that we want to be stored in that field if our expectation turns out to be correct.

    The offset of the field can be determined by using

    /**
     * Reports the location of the field with a given name in the storage
     * allocation of its class.
     *
     * @throws NullPointerException if any parameter is {@code null}.
     * @throws InternalError if there is no field named {@code name} declared
     *         in class {@code c}, i.e., if {@code c.getDeclaredField(name)}
     *         would throw {@code java.lang.NoSuchFieldException}.
     *
     * @see #objectFieldOffset(Field)
     */
    public long objectFieldOffset(Class<?> c, String name) {
        if (c == null || name == null) {
            throw new NullPointerException();
        }
    
        return objectFieldOffset1(c, name);
    }
    

    The field offset is a constant that needs to be determined only once and it is usually stored in a static final long. This is what BUF_OFFSET is (the offset of the buf field inside a BufferedInputStream instance).

    So, the (threadsafe) code

    if (!U.compareAndSetReference(this, BUF_OFFSET, buffer, nbuf)) {
        throw new IOException("Stream closed");
    }
    

    is logically equivalent to the (single-threaded) code

    if (buf == buffer) {
        buf = nbuf;
    } else {
        throw new IOException("Stream closed");
    }
    

    The only difference is that compareAndSetReference is atomic while the latter code is not.
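
    To see why the failure branch matters, the interleaving of a close() between reading buf and replacing it can be simulated in a single thread. The sketch below is an assumption-laden illustration, not the JDK code: MiniBuffer, replace() and demo() are hypothetical names, a VarHandle stands in for the Unsafe call and BUF_OFFSET, and close() is written the way the explanation above describes it.

    ```java
    import java.io.IOException;
    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.VarHandle;

    // Hypothetical stand-in for the buf handling in a buffered stream.
    class MiniBuffer {
        // Plays the role of BUF_OFFSET: resolved once, reused for every CAS.
        private static final VarHandle BUF;
        static {
            try {
                BUF = MethodHandles.lookup().findVarHandle(MiniBuffer.class, "buf", byte[].class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        private volatile byte[] buf = new byte[4];

        byte[] getBufIfOpen() throws IOException {
            byte[] buffer = buf;
            if (buffer == null) throw new IOException("Stream closed");
            return buffer;
        }

        // Installs nbuf only if buf is still the array the caller saw earlier.
        void replace(byte[] buffer, byte[] nbuf) throws IOException {
            if (!BUF.compareAndSet(this, buffer, nbuf)) {
                throw new IOException("Stream closed");
            }
        }

        // Clears buf; retries if another array was swapped in meanwhile.
        void close() {
            byte[] buffer;
            while ((buffer = buf) != null) {
                if (BUF.compareAndSet(this, buffer, null)) return;
            }
        }

        static String demo() {
            MiniBuffer b = new MiniBuffer();
            StringBuilder out = new StringBuilder();
            try {
                byte[] seen = b.getBufIfOpen();
                b.replace(seen, new byte[8]);        // CAS succeeds
                out.append(b.getBufIfOpen().length);

                seen = b.getBufIfOpen();
                b.close();                           // simulated asynchronous close
                b.replace(seen, new byte[16]);       // CAS fails: buf is now null
            } catch (IOException e) {
                out.append(' ').append(e.getMessage());
            }
            return out.toString();
        }

        public static void main(String[] args) {
            System.out.println(demo()); // prints "8 Stream closed"
        }
    }
    ```

    With a plain if (buf == buffer) buf = nbuf; the close() in the middle could be silently overwritten in a truly concurrent run, leaving a closed stream with a live buffer.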