I'm using construct 2.8 to reverse-engineer the header of some files created by a long-lost Pascal program.
The header is made of a number of different records, some of which are optional, and I'm not sure whether the order is fixed or not.
For instance, two of the records look like this:
import construct as cs

header_record_filetype = cs.Struct(
    'record_type' / cs.Int8ub,
    'file_type' / cs.PascalString(cs.Int16ub),
    'unknown' / cs.Int8ub
)

header_record_user = cs.Struct(
    'record_type' / cs.Int8ub,
    'user' / cs.PascalString(cs.Int16ub)
)
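To make the byte layout concrete: `PascalString(cs.Int16ub)` is a 16-bit big-endian length followed by that many bytes. Below is a minimal sketch, using only the standard `struct` module, that builds and re-reads sample bytes for the `user` record so there is something small to feed a parser. The type value 0x0A is the one from the Enum further down; the helper names and the sample user name are hypothetical.

```python
import struct

def pack_pascal16(text: bytes) -> bytes:
    """Length-prefixed string: 2-byte big-endian length, then the bytes."""
    return struct.pack('>H', len(text)) + text

def build_user_record(user: bytes, record_type: int = 0x0A) -> bytes:
    """One type byte followed by the length-prefixed user name."""
    return struct.pack('>B', record_type) + pack_pascal16(user)

def parse_user_record(data: bytes):
    """Inverse of build_user_record: (record_type, user, bytes_consumed)."""
    record_type, length = struct.unpack_from('>BH', data, 0)
    user = data[3:3 + length]
    return record_type, user, 3 + length

# Hypothetical sample data, not taken from a real file.
sample = build_user_record(b'alice')
```

Bytes built this way can be passed to `header_record_user.parse()` to check that the struct definition matches the layout I think the file has.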
And I've identified half a dozen more.
How would I go about getting the parser to choose the correct record type based on the record_type member, for an unknown number of records, until it comes across a record with type 0 (or reaches the end of the file)?
I've solved it like this:
import sys

header = cs.Struct(
    # The type byte is consumed here, so the per-record structs used below
    # must not begin with their own 'record_type' field.
    'record_type' / cs.Int8ub,
    'record' / cs.Switch(
        cs.this.record_type,
        {
            header_record_type_0x01: header_record_0x01,
            header_record_type_filename: header_record_filename,
            header_record_type_filetype: header_record_filetype,
            header_record_type_user: header_record_user,
            header_record_type_end: header_record_end,
            header_record_type_image_metadata: header_record_image_metadata,
        },
        default=header_record_end
    ),
    # Bytes consumed by this record, since each parse starts from a fresh slice.
    'offset' / cs.Tell
)

with open(sys.argv[1], 'rb') as f:
    h = f.read(2048)

index = 0
record_type = h[index]
while record_type != 0:
    record = header.parse(h[index:])
    print(record)
    index += record.offset
    record_type = record.record_type
But I don't know if that is the best* way of doing it.
*For some value of "best".
I found the RepeatUntil() construct hiding at the bottom of a help page. So now I have this:
header = cs.Struct(
    'type' / cs.Enum(cs.Int8ub,
                     file_metadata=0x01,
                     filename=0x02,
                     file_type=0x03,
                     user=0x0A,
                     image_metadata=0x10,
                     end=0xFF),
    'record' / cs.Switch(cs.this.type, {
        'file_metadata': header_record_file_metadata,
        'filename': header_record_filename,
        'file_type': header_record_filetype,
        'user': header_record_user,
        'end': header_record_end,
        'image_metadata': header_record_image_metadata,
    }),
    # Tell reports the stream position after this record. Under RepeatUntil
    # that is the offset from the start of the data, not the record's own size.
    'size' / cs.Tell
)

with open(sys.argv[1], 'rb') as f:
    h = f.read(2048)

records = cs.RepeatUntil(lambda obj, lst, ctx: obj.type == 'end', header).parse(h)
print(records)
This feels a lot cleaner and more in keeping with the declarative nature of construct.
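One part of the original question that RepeatUntil does not cover is the "or reaches the end of the file" case: as far as I can tell, it raises an error if the data runs out before the predicate becomes true. For cross-checking, here is a plain-Python sketch of the same dispatch loop that stops at either the end record or the end of the data. It is not construct code, it only knows the two record layouts shown at the top, it assumes the end record has no payload, and all function names are hypothetical.

```python
import struct

END = 0xFF  # 'end' value from the Enum above

def read_pascal16(data, pos):
    """Read a 2-byte big-endian length plus that many bytes; return (bytes, new_pos)."""
    (length,) = struct.unpack_from('>H', data, pos)
    start = pos + 2
    if start + length > len(data):
        raise ValueError('string runs past the end of the data')
    return data[start:start + length], start + length

def read_file_type(data, pos):
    """file_type record body: length-prefixed string, then one unknown byte."""
    text, pos = read_pascal16(data, pos)
    return {'file_type': text, 'unknown': data[pos]}, pos + 1

def read_user(data, pos):
    """user record body: a single length-prefixed string."""
    text, pos = read_pascal16(data, pos)
    return {'user': text}, pos

# Type byte -> reader for the record body (only the two known layouts).
READERS = {0x03: read_file_type, 0x0A: read_user}

def parse_header(data):
    """Collect (type, body) pairs until the end record or the end of the data."""
    records = []
    pos = 0
    while pos < len(data):
        record_type = data[pos]
        pos += 1
        if record_type == END:
            # Assumption: the end record carries no payload.
            records.append((record_type, None))
            break
        reader = READERS.get(record_type)
        if reader is None:
            raise ValueError('unknown record type 0x%02X' % record_type)
        body, pos = reader(data, pos)
        records.append((record_type, body))
    return records
```

Running both parsers over the same bytes and comparing the results is a cheap way to confirm that the declarative version reads what I expect.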