This is a bit of a long-winded question, so stick with me...
I'm trying to read a binary file that is laid out like a database, with rows of records. A two-byte integer indicates the start of each new row/record.
What I know about the file is:
It has the following fields, each with a fixed maximum length:
NAME_LENGTH = 32;
LOCATION_LENGTH = 64;
SPECIALTIES_LENGTH = 64;
SIZE_LENGTH = 6;
RATE_LENGTH = 8;
OWNER_LENGTH = 8;
There is also a header that occupies the first 72 bytes (I identified this with a hex editor). I skip it, so I start reading at byte 72.
With this in mind, my limited understanding was that I could read each field's maximum number of bytes into a variable. If, for example, the name were "John", the first four bytes would hold "John" and the rest of the field would be whitespace. I then assumed I could go on to read the next chunk of bytes to get the next value.
I put together a method to do just that:
    private Contract retrieveContract(long locationInFile) throws IOException {
        final byte[] input = new byte[Contract.RECORD_LENGTH];
        synchronized (database) {
            database.seek(locationInFile);
            database.readFully(input);
        }
        class RecordFieldReader {
            private int offset = 0;
            String read(int length) throws UnsupportedEncodingException {
                String str = new String(input, offset, length, "UTF-8");
                offset += length;
                return str.trim();
            }
        }
        RecordFieldReader readRecord = new RecordFieldReader();
        String name = readRecord.read(Contract.NAME_LENGTH);
        String location = readRecord.read(Contract.LOCATION_LENGTH);
        String specialties = readRecord.read(Contract.SPECIALTIES_LENGTH);
        String size = readRecord.read(Contract.SIZE_LENGTH);
        String rate = readRecord.read(Contract.RATE_LENGTH);
        String owner = readRecord.read(Contract.OWNER_LENGTH);
        return "DELETED".equals(name) ? null : new Contract(name, location, specialties, size, rate, owner);
    }
The Contract.* constants are defined in a separate class, as shown below. (The overall goal here is to read each record into its own object.)
    static final int NAME_LENGTH = 32;
    static final int LOCATION_LENGTH = 64;
    static final int SPECIALTIES_LENGTH = 64;
    static final int SIZE_LENGTH = 6;
    static final int RATE_LENGTH = 8;
    static final int OWNER_LENGTH = 8;
    static final int RECORD_LENGTH = NAME_LENGTH
            + LOCATION_LENGTH
            + SPECIALTIES_LENGTH
            + SIZE_LENGTH
            + RATE_LENGTH
            + OWNER_LENGTH;
The outcome of the above is that the first few records are aligned as they should be, but then they seem to go all over the place.
Output here as a gist (it's a bit long, sorry)
and finally the system crashes with:
    java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:421)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399)
        at suncertify.db.ContractFileAccess.retrieveContract(ContractFileAccess.java:99)
        at suncertify.db.ContractFileAccess.getContractList(ContractFileAccess.java:63)
        at suncertify.db.ContractFileAccess.<init>(ContractFileAccess.java:45)
        at suncertify.db.Main.main(Main.java:17)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Finally, the question: if I'm always reading the maximum length of each field, outputting it, and then moving on to the next block, why does the output get more jumbled the further down the file it goes?
The original input is another gist here for completeness.
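Sometimes putting a problem out in a public space gets you thinking about it in a different way. The problem was that I wasn't taking the two-byte separator into account. Each record was therefore read two bytes further out of alignment than the one before it, which is why the output got worse the further down the file it went. This was solved simply by adding 2 to the record length: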
    static final int RECORD_LENGTH = 2 + NAME_LENGTH
            + LOCATION_LENGTH
            + SPECIALTIES_LENGTH
            + SIZE_LENGTH
            + RATE_LENGTH
            + OWNER_LENGTH;
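To see why the damage grows with each record, here is a minimal, self-contained sketch that fakes the file in memory and reads the name field with both record lengths. Everything in it is hypothetical and only for illustration: the class name RecordDriftDemo, the helpers buildFile and readName, and the assumptions that the two-byte flag sits in front of each record's fields and is zero. It also leaves out the 72-byte header and fills each record with a single repeated letter so that misaligned reads are easy to spot.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RecordDriftDemo {
    static final int FLAG_LENGTH = 2;                               // two-byte marker before each record (assumed position)
    static final int NAME_LENGTH = 32;
    static final int FIELDS_LENGTH = 32 + 64 + 64 + 6 + 8 + 8;      // 182 bytes of field data
    static final int FULL_RECORD_LENGTH = FLAG_LENGTH + FIELDS_LENGTH; // 184 bytes on disk

    /** Builds a fake file: record i is two zero flag bytes followed by 182 copies of the letter 'A' + i. */
    static byte[] buildFile(int records) {
        byte[] file = new byte[records * FULL_RECORD_LENGTH];
        for (int i = 0; i < records; i++) {
            int start = i * FULL_RECORD_LENGTH;
            Arrays.fill(file, start + FLAG_LENGTH, start + FULL_RECORD_LENGTH, (byte) ('A' + i));
        }
        return file;
    }

    /** Reads the name field of record `index`, stepping through the file in units of `stride` bytes. */
    static String readName(byte[] file, int index, int stride) {
        int start = index * stride + FLAG_LENGTH;
        return new String(file, start, NAME_LENGTH, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] file = buildFile(10);
        for (int i = 0; i < 10; i += 3) {
            // A stride of 182 starts 2 * i bytes too early; a stride of 184 stays aligned.
            System.out.println("record " + i + ", stride 182: " + readName(file, i, FIELDS_LENGTH));
            System.out.println("record " + i + ", stride 184: " + readName(file, i, FULL_RECORD_LENGTH));
        }
    }
}
```

With the 182-byte stride, the read for record i starts 2 * i bytes too early, so it picks up the tail of the previous record and the flag bytes before reaching its own data. Because trim() strips leading characters up to and including the space character (which covers zero bytes), the first record or two can still look plausible, which would be consistent with the early rows appearing correct before the rest go astray.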
Thank you StackOverflow for being a place to rant and solve ;)