Search code examples
mysqlinnodbfile-formatraw-data

Dissecting MySQL InnoDB record format to restore from raw disk


I had a mysql database stored on a USB thumb drive which has irreparably lost its file allocation table. Therefore, I cannot get to the ibdata1 file as a whole. I can, however locate the record pages which were used using a hex editor.

All the data is there, but I have to read each record myself and play back new SQL statements to a database restored from a 6 month old backup.

Because I have a backup, I know the table structure. and can find a record in the new database that I know roughly equates to a small block of binary data. However, I am having trouble determining exactly where the record starts and decoding the record data.

The CREATE statement for the table is: CREATE TABLE ExpenseTransactions ( idExpenseTransactions int(11) NOT NULL AUTO_INCREMENT, TransactionDate datetime NOT NULL, DollarAmount float DEFAULT NULL, PoundAmount float DEFAULT NULL, Location varchar(255) DEFAULT NULL, MinorCategory int(11) NOT NULL, Comment varchar(255) DEFAULT NULL, Recurring bit(1) NOT NULL DEFAULT b'0', Estimate bit(1) NOT NULL DEFAULT b'0', PRIMARY KEY (idExpenseTransactions), KEY MinorCategory (MinorCategory) ) ENGINE=InnoDB AUTO_INCREMENT=4687 DEFAULT CHARSET=utf8;

A clean record looks like this: '2924', '2013-11-01 00:00:00', '60', NULL, 'George', '66', 'Lawn Maintenance', '1', '0'

The hex bytes associated with this record are next. I am pretty certain have more bytes than necessary to recreate the record, but I have marked what I believe is the id field to try to give some reference point. 10 06 02 00 01 70 00 41 80 00 0B 6C 00 00 00 00 07 05 86 00 00 01 4A 0E B1 80 00 12 4F 23 1F C1 40 00 00 70 42 47 65 6F 72 67 65 80 00 00 42 4C 61 77 6E 20 4D 61 69 6E 74 65 6E 61 6E 63 65 01 00

I can fathom out the strings easily enough and I can pick out the 4 bytes making up the MinorCategory. The last 2 bytes should represent the 2 bit values. The rest is more difficult.


Solution

  • The record in question is correctly identified, and per my blog post The physical structure of records in InnoDB, here's how it decodes:

    Header:
    10                          Length of Comment = 16 bytes
    06                          Length of Location = 6 bytes
    02                          Nullable field bitmap (PoundAmount = NULL)
    00                          Info flags and number of records owned
    01 70                       Heap number and record type
    00 41                       Offset to next record = +65 bytes
    
    Record:
    80 00 0B 6C                 idExpenseTransactions = 2924
    00 00 00 00 07 05           TRX_ID
    86 00 00 01 4A 0E B1        ROLL_PTR
    80 00 12 4F 23 1F C1 40     TransactionDate = "2013-11-01 00:00:00"
    00 00 70 42                 DollarAmount = 60.0
                                (No data, PoundAmount = NULL)
    47 65 6F 72 67 65           Location = "George"
    80 00 00 42                 MinorCategory = 66
    4C 61 77 6E 20 4D 61 69     Comment = "Lawn Maintenance"
    6E 74 65 6E 61 6E 63 65     (Comment continues...)
    01                          Recurring = 1
    00                          Estimate = 0