Write binary files using the Hadoop FileSystem API (org.apache.hadoop.fs.FileSystem)


I used to generate a binary file using a MappedByteBuffer and a FileChannel, like this:

  final Path fullPath = ..
  final File parent = fullPath.toFile();
  // Open (or create) the target file for reading and writing.
  FileChannel out = new RandomAccessFile(new File(parent, "myFileName"), "rw").getChannel();
  // Map the first `size` bytes of the file into memory.
  MappedByteBuffer buffer = out.map(FileChannel.MapMode.READ_WRITE, 0, size);
  buffer.order(ByteOrder.LITTLE_ENDIAN);
  buffer.putInt(12);
  buffer.putFloat(25);
  buffer.putFloat(32);

I tried to generate the same binary file using the Hadoop API, like this:

  FileSystem hadoopFileSys = hadoopDataSource.getFileSystem();

  FSDataOutputStream outputStream = hadoopFileSys.create(hadoopPath);
  outputStream.writeInt(12);
  outputStream.writeFloat(25);
  outputStream.writeFloat(32);

My problem is that the file generated with the Hadoop API is not identical to the one generated with the NIO API.

I need the two generated files to be identical, because they will be parsed by the same tools.


Solution

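  • FSDataOutputStream extends java.io.DataOutputStream, whose writeInt and writeFloat always write big-endian (high byte first), while your NIO buffer is set to little-endian. For the two files to match, the NIO code needs to use: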

        buffer.order(ByteOrder.BIG_ENDIAN);
    

    nio (with BIG_ENDIAN):

    00 00 00 0C 41 C8 00 00 42 00 00 00
    

    hadoop:

    00 00 00 0C 41 C8 00 00 42 00 00 00
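
The comparison above can be reproduced without a cluster. The following is only a sketch, not code from the original answer: it uses a plain java.io.DataOutputStream over a ByteArrayOutputStream as a stand-in for FSDataOutputStream (which extends DataOutputStream), and the names EndianDemo, encodeWithNio, encodeWithDataOutput and hex are hypothetical helpers made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianDemo {

    // Encode the three sample values with an explicit byte order,
    // mirroring the putInt/putFloat calls from the question.
    static byte[] encodeWithNio(ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(12).order(order);
        buffer.putInt(12);
        buffer.putFloat(25);
        buffer.putFloat(32);
        return buffer.array();
    }

    // Encode the same values through DataOutputStream (assumption: used here
    // as a stand-in for FSDataOutputStream, which is a subclass of it).
    static byte[] encodeWithDataOutput() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(12);
            out.writeFloat(25);
            out.writeFloat(32);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Format bytes as space-separated upper-case hex pairs.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: 00 00 00 0C 41 C8 00 00 42 00 00 00
        System.out.println(hex(encodeWithNio(ByteOrder.BIG_ENDIAN)));
        // prints: 00 00 00 0C 41 C8 00 00 42 00 00 00
        System.out.println(hex(encodeWithDataOutput()));
        // prints: 0C 00 00 00 00 00 C8 41 00 00 00 42
        System.out.println(hex(encodeWithNio(ByteOrder.LITTLE_ENDIAN)));
    }
}
```

If the parsing tools require the original little-endian layout instead, one option is to go the other way: encode the values with a ByteBuffer in LITTLE_ENDIAN order and pass the resulting array to outputStream.write(byte[]), which writes the bytes unchanged.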