
Read JPEG magic number with FileChannel and ByteBuffer


I started digging into the Java NIO API, and as a first try I wanted to read a JPEG file's magic number.

Here's the code:

import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

public class JpegMagicNumber {
    public static void main(String[] args) throws Exception {
        // Open the file named by the first command-line argument;
        // try-with-resources closes the channel even if reading fails.
        try (FileChannel file = new FileInputStream(args[0]).getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(6);
            file.read(buffer);   // fill the buffer with up to six bytes
            buffer.flip();       // switch the buffer from filling to draining
            // Decode the raw bytes as text in the platform default charset.
            System.out.println(Charset.defaultCharset().decode(buffer).toString());
        }
    }
}

I expect to get the magic number chars back, but all I get is garbage data in the terminal.

Am I doing something wrong?


Solution

  • Short answer: There is nothing actually wrong with the code. A JPEG file just has 'garbage' up front.

    Long answer: Internally, a JPEG file is made up of segments, one after the other. Each segment starts with a 0xFF byte, followed by an identifier byte, and then an optional payload.

    Example start:

    FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01 00 01 00 00 FF E1
    

    The image starts with the Start Of Image (SOI) segment, 0xFF 0xD8, which has no payload.

    The next segment is 'application specific' (APP0), 0xFF 0xE0. Two bytes follow with the length of the payload (these two bytes included!), stored high byte first: here 0x00 0x10, i.e. 16 bytes.

    0x4A 0x46 0x49 0x46 : JFIF ← perhaps what you were looking for?
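
    The segment layout described above can be walked with a ByteBuffer. The following is only a minimal sketch: the class name JpegSegmentWalk and the method describeSegments are made up for illustration, only the SOI segment is treated as payload-free, and the input is the example byte sequence from this answer rather than a real file.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class JpegSegmentWalk {

    /** Lists the leading segments of a JPEG byte sequence as short descriptions. */
    static List<String> describeSegments(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default, like JPEG lengths
        List<String> out = new ArrayList<>();
        while (buf.remaining() >= 2) {
            int prefix = buf.get() & 0xFF;
            int id = buf.get() & 0xFF;
            if (prefix != 0xFF) {
                break; // not at a segment boundary, stop walking
            }
            if (id == 0xD8) {
                out.add("FFD8 SOI"); // Start Of Image carries no payload
                continue;
            }
            if (buf.remaining() < 2) {
                out.add("FF" + String.format("%02X", id) + " (truncated)");
                break;
            }
            int length = buf.getShort() & 0xFFFF; // includes the two length bytes
            out.add("FF" + String.format("%02X", id) + " length=" + length);
            int payload = length - 2;
            if (payload < 0 || payload > buf.remaining()) {
                break; // the data ends before the segment does
            }
            buf.position(buf.position() + payload); // skip to the next segment
        }
        return out;
    }

    public static void main(String[] args) {
        // The example bytes from the answer above.
        byte[] example = {
            (byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0, 0x00, 0x10,
            0x4A, 0x46, 0x49, 0x46, 0x00, 0x01, 0x01, 0x00, 0x00, 0x01,
            0x00, 0x01, 0x00, 0x00, (byte) 0xFF, (byte) 0xE1
        };
        describeSegments(example).forEach(System.out::println);
    }
}
```

    For the example bytes this reports the SOI segment, the 0xFF 0xE0 segment with length 16, and a 0xFF 0xE1 segment that is cut off because the example ends there.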

    JPEG doesn't have a magic number in the sense you were perhaps looking for, like 'PK' for zip or '‰PNG' for PNG. (The closest thing is 0xFF 0xD8 0xFF for the SOI and the first byte of the next segment.)
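
    If the goal is simply to recognise a JPEG file, checking for those three bytes is probably enough. Here is a rough sketch that keeps the FileChannel and ByteBuffer approach from the question; the class name JpegSignatureCheck and the method startsWithJpegSignature are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of any library.
final class JpegSignatureCheck {

    /** Returns true if the bytes begin with 0xFF 0xD8 0xFF. */
    static boolean startsWithJpegSignature(byte[] head) {
        return head.length >= 3
                && (head[0] & 0xFF) == 0xFF
                && (head[1] & 0xFF) == 0xD8
                && (head[2] & 0xFF) == 0xFF;
    }

    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Paths.get(args[0]), StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(3);
            // Keep reading until the buffer is full or the file ends.
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // nothing else to do per read
            }
            buf.flip();
            byte[] head = new byte[buf.remaining()]; // only the bytes actually read
            buf.get(head);
            System.out.println(startsWithJpegSignature(head)
                    ? "starts with the JPEG signature"
                    : "does not start with the JPEG signature");
        }
    }
}
```

    The bytes are masked with 0xFF before comparing because Java bytes are signed, so 0xFF would otherwise be read as -1.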

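    So your code does correctly read the first six bytes of the file (FF D8 FF E0 00 10 in a file like the example above), decodes them into characters using your platform's default charset, and prints them out. None of those bytes is meant to be text, so a JPEG header just looks that way; in the example, 'JFIF' only begins at the seventh byte.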
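
    To see what was actually read, one option is to print the bytes as hexadecimal instead of decoding them as characters. This is a small sketch along the lines of the question's code; the class name JpegHeaderHex and the method toHex are invented for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of any library.
final class JpegHeaderHex {

    /** Formats the remaining bytes of the buffer as space-separated hex pairs. */
    static String toHex(ByteBuffer buf) {
        StringBuilder sb = new StringBuilder();
        while (buf.hasRemaining()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative byte values print as two hex digits.
            sb.append(String.format("%02X", buf.get() & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        try (FileChannel ch = FileChannel.open(Paths.get(args[0]), StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(6);
            ch.read(buffer);  // read up to the first six bytes
            buffer.flip();    // switch from filling to draining
            System.out.println(toHex(buffer));
        }
    }
}
```

    For a file that begins like the example above, this should print FF D8 FF E0 00 10 rather than unreadable characters.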