
Struct unpack on win32file.DeviceIoControl


I am trying to understand and work with win32file. I need to read USN Journals and am having a hard time understanding the code snippets I found online. This is the snippet I found -

format = 'qqqqqLLLLqqqqq'
length = struct.calcsize(format)
out_buffer = win32file.DeviceIoControl(volh, winioctlcon.FSCTL_GET_NTFS_VOLUME_DATA, None, length)
data = struct.unpack(format, out_buffer)

Now, I am really rusty when it comes to C and its structures. What I have understood so far is that format describes a 96-byte buffer, which receives the output from DeviceIoControl.

So I tried changing the format to 'QQQQQQQQQQQQQQQQQQQ' to see what happens (I am fairly clueless about what might actually happen), and it turns out I got a larger out_buffer this time. So I tried to unpack it -

struct.unpack(format, out_buffer)

To my surprise, I got -

struct.error: unpack requires a string argument of length 152

So I added another 'Q' to increase the size and got the same result. I don't understand why 'qqqqqLLLLqqqqq' works and 'QQQQQQQQQQQQQQQQQQQ' does not. So my questions are -

  • My understanding was we can unpack if buffer was larger than the output so why doesn't the unpack work?

  • Would I have to remember these formats every-time I want to get something out from DeviceIoControl?

Pointing me to resources would also be a bonus, as I need to build on this code to read USN Journals, and I don't think trial and error is going to get me anywhere.


Solution

  • Let's split the problem into smaller pieces and take them one at a time.

    • The win32file module is part of [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WinAPIs

    • DeviceIoControl acts differently, depending on the dwIoControlCode (2nd argument). For FSCTL_GET_NTFS_VOLUME_DATA, it fills a buffer with volume specific data. From [MS.Docs]: FSCTL_GET_NTFS_VOLUME_DATA IOCTL:

      lpOutBuffer
      A pointer to the output buffer, an NTFS_VOLUME_DATA_BUFFER (@CristiFati: !!! Broken URL !!!) structure. The file record associated with the file identifier specified in the input buffer is returned in this buffer. Refer to the Remarks section of the documentation for the NTFS_VOLUME_DATA_BUFFER structure for specific information on how to determine the correct size of this buffer.

      Here's an alternative to the above broken URL: [MSDN]: NTFS_VOLUME_DATA_BUFFER structure. As I'm not sure how long it will remain valid, I'm pasting the structure definition below (from Windows Kits 8.1: winioctl.h (line #4987)):

      typedef struct {
      
          LARGE_INTEGER VolumeSerialNumber;
          LARGE_INTEGER NumberSectors;
          LARGE_INTEGER TotalClusters;
          LARGE_INTEGER FreeClusters;
          LARGE_INTEGER TotalReserved;
          DWORD BytesPerSector;
          DWORD BytesPerCluster;
          DWORD BytesPerFileRecordSegment;
          DWORD ClustersPerFileRecordSegment;
          LARGE_INTEGER MftValidDataLength;
          LARGE_INTEGER MftStartLcn;
          LARGE_INTEGER Mft2StartLcn;
          LARGE_INTEGER MftZoneStart;
          LARGE_INTEGER MftZoneEnd;
      
      } NTFS_VOLUME_DATA_BUFFER, *PNTFS_VOLUME_DATA_BUFFER;
      
    • The [Python 3.Docs]: struct - Interpret bytes as packed binary data module is used for conversions between binary and "normal" data. Its documentation explains the meaning of all the format characters (q, Q, L, ...), and much more. You could also take a look at [SO]: Python struct.pack() behavior for more (practical) details
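To make the format string less of a magic value, here is a minimal sketch that pairs it with the field names from the structure definition above. The helper parse_volume_data, the NtfsVolumeData tuple and the stand-in buffer are my own illustration (not part of pywin32), and the "<" prefix is an assumption I add so that the sizes are the same on every platform:

```python
import struct
from collections import namedtuple

# "<" selects little-endian standard sizes (q = 8 bytes, L = 4 bytes).
# Without it, native "L" is 4 bytes on Windows but 8 on most 64-bit Unix systems.
VOL_DATA_FMT = "<5q4L5q"  # same layout as "qqqqqLLLLqqqqq" on Windows

# Field names copied from the NTFS_VOLUME_DATA_BUFFER definition
NtfsVolumeData = namedtuple("NtfsVolumeData", [
    "VolumeSerialNumber", "NumberSectors", "TotalClusters", "FreeClusters",
    "TotalReserved", "BytesPerSector", "BytesPerCluster",
    "BytesPerFileRecordSegment", "ClustersPerFileRecordSegment",
    "MftValidDataLength", "MftStartLcn", "Mft2StartLcn",
    "MftZoneStart", "MftZoneEnd",
])


def parse_volume_data(buf):
    """Hypothetical helper: unpack the first 96 bytes of buf into named fields."""
    return NtfsVolumeData._make(struct.unpack_from(VOL_DATA_FMT, buf))


# Stand-in for a DeviceIoControl result: three values taken from the output
# further down in this post, every other field is a placeholder zero.
sample = struct.pack(VOL_DATA_FMT,
                     0, 494374911, 61796863, 0, 0,  # 5 x LARGE_INTEGER
                     0, 4096, 0, 0,                 # 4 x DWORD
                     0, 0, 0, 0, 0)                 # 5 x LARGE_INTEGER
parsed = parse_volume_data(sample)

print(struct.calcsize(VOL_DATA_FMT))  # 96
print(parsed.TotalClusters, parsed.BytesPerCluster)
```

With named fields, the caller no longer needs to remember that BytesPerCluster sits at index 6.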

    After going over the above materials, things should become clearer.
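As an alternative to a hand-written format string, the NTFS_VOLUME_DATA_BUFFER definition quoted above can also be mirrored field by field with ctypes. This is only a sketch: the class name NtfsVolumeDataBuffer is mine, I'm mapping LARGE_INTEGER to c_int64 and DWORD to c_uint32, and the zero-filled buffer stands in for what DeviceIoControl would return:

```python
import ctypes


class NtfsVolumeDataBuffer(ctypes.Structure):
    # Hypothetical Python mirror of NTFS_VOLUME_DATA_BUFFER (winioctl.h)
    _fields_ = [
        ("VolumeSerialNumber", ctypes.c_int64),
        ("NumberSectors", ctypes.c_int64),
        ("TotalClusters", ctypes.c_int64),
        ("FreeClusters", ctypes.c_int64),
        ("TotalReserved", ctypes.c_int64),
        ("BytesPerSector", ctypes.c_uint32),
        ("BytesPerCluster", ctypes.c_uint32),
        ("BytesPerFileRecordSegment", ctypes.c_uint32),
        ("ClustersPerFileRecordSegment", ctypes.c_uint32),
        ("MftValidDataLength", ctypes.c_int64),
        ("MftStartLcn", ctypes.c_int64),
        ("Mft2StartLcn", ctypes.c_int64),
        ("MftZoneStart", ctypes.c_int64),
        ("MftZoneEnd", ctypes.c_int64),
    ]


# The size to request from DeviceIoControl, derived from the definition
STRUCT_SIZE = ctypes.sizeof(NtfsVolumeDataBuffer)

# Zero-filled stand-in for the bytes object DeviceIoControl would return
fake_buf = bytes(STRUCT_SIZE)
vol = NtfsVolumeDataBuffer.from_buffer_copy(fake_buf)

print(STRUCT_SIZE)  # 96
print(vol.BytesPerCluster)
```

The size then follows from the field list, so there is no separate format string to keep in sync.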

    A couple of notes:

    • If one doesn't know what a function does (returns), they probably shouldn't use it (without reading the manual, of course). That said, nowadays both Win (which has always had a lot of restrictions for the regular user) and Nix "protect users from themselves" (e.g.: root login no longer allowed, write-protected %SystemDrive%, ...)
    • The attempts (trial and error) show some lack of experience (probably everyone does it at some point, the key is not to rely solely on it)
    • "Would I have to remember these formats every-time I want to get something out from DeviceIoControl?"
      • Again, if you don't know what a function does, what's the reason for calling it? If you meant learning NTFS_VOLUME_DATA_BUFFER by heart, that's definitely not the case. You need to know its structure only when using it (and as you've noticed, there are some places that you can get it from - including this very post :) )
    • "My understanding was we can unpack if buffer was larger than the output so why doesn't the unpack work?"
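      • Not quite, and there are two things going on. struct.unpack requires the buffer length to match the format size exactly (struct.unpack_from is the variant that accepts a longer buffer). Also, win32file.DeviceIoControl seems to return only the bytes that were actually written, so passing a value greater than the expected one (via the length argument) doesn't guarantee an output buffer of that length: yours came back shorter than the 152 bytes that the new format needs. When passing a smaller value, the call will fail (as expected)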
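The size rules from the last note can be checked without touching any volume. In this sketch the zero-filled bytes objects are stand-ins for DeviceIoControl results, and the "<" prefix is an assumption I add to get the same sizes on every platform:

```python
import struct

FMT_96 = "<5q4L5q"  # layout of NTFS_VOLUME_DATA_BUFFER
FMT_152 = "<19Q"    # the 19 x 'Q' format from the question

returned = bytes(96)  # stand-in for a 96-byte DeviceIoControl result

# 1. A buffer shorter than the format: struct.unpack raises struct.error
try:
    struct.unpack(FMT_152, returned)
    short_error = None
except struct.error as exc:
    short_error = str(exc)

# 2. A buffer longer than the format: struct.unpack still raises,
#    because it wants an exact size match
padded = returned + bytes(56)  # 152 bytes in total
try:
    struct.unpack(FMT_96, padded)
    long_error = None
except struct.error as exc:
    long_error = str(exc)

# 3. struct.unpack_from reads from the start and ignores the extra bytes
fields = struct.unpack_from(FMT_96, padded)

print(struct.calcsize(FMT_96), struct.calcsize(FMT_152))  # 96 152
print(short_error)
print(long_error)
print(len(fields))  # 14
```

This is why the example below calls struct.unpack_from rather than struct.unpack.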

    I've also prepared a dummy Python example.

    code00.py:

    #!/usr/bin/env python3
    
    import sys
    import struct
    import win32file
    import win32api
    import win32con
    import winioctlcon
    
    
    VOLUME_LETTER = "E"
    
    FILE_READ_ATTRIBUTES = 0x0080
    FILE_EXECUTE = 0x0020
    
    vol_data_buf_fmt = "qqqqqLLLLqqqqq"  # This is the format that matches the NTFS_VOLUME_DATA_BUFFER definition (96 bytes). Note: Instead of each 'q' you could also use 'Ll', as 'LARGE_INTEGER' is a union
    
    BINARY_FORMAT_LIST = [
        vol_data_buf_fmt,
        "QQQQQQQQQQQQQQQQQQQ",
    ]
    
    
    def print_formats():  # Dummy func
        print("Formats and lengths:")
        for format in BINARY_FORMAT_LIST:
            print("    {:s}: {:d}".format(format, struct.calcsize(format)))
    
    
    def main():
        #print_formats()
        vol_unc_name = "\\\\.\\{:s}:".format(VOLUME_LETTER)
        print("volume: ", vol_unc_name)
        access_flags = FILE_READ_ATTRIBUTES | FILE_EXECUTE  # Apparently, doesn't work without FILE_EXECUTE
        share_flags = win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE  # Doesn't work without FILE_SHARE_WRITE
        creation_flags = win32con.OPEN_EXISTING
        attributes_flags = win32con.FILE_ATTRIBUTE_NORMAL
        vol_handle = win32file.CreateFile(vol_unc_name, access_flags, share_flags, None, creation_flags, attributes_flags, None)
    
        buf_len = struct.calcsize(vol_data_buf_fmt)
        for i in [buf_len]:
            print("    Passing a buffer size of: {:d}".format(i))
            buf = win32file.DeviceIoControl(vol_handle, winioctlcon.FSCTL_GET_NTFS_VOLUME_DATA, None, i)
            print("    DeviceIocontrol returned a {:d} bytes long {:}".format(len(buf), type(buf)))
            out = struct.unpack_from(vol_data_buf_fmt, buf)
            print("\n    NumberSectors: {:}\n    TotalClusters: {:}\n    BytesPerCluster: {:}".format(out[1], out[2], out[6]))
        win32api.CloseHandle(vol_handle)
    
    
    if __name__ == "__main__":
        print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
        main()
    

    Output:

    (py35x64_test) e:\Work\Dev\StackOverflow\q053318932>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" ./code00.py
    Python 3.5.4 (v3.5.4:3f56838, Aug  8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)] on win32
    
    volume:  \\.\E:
        Passing a buffer size of: 96
        DeviceIocontrol returned a 96 bytes long <class 'bytes'>
    
        NumberSectors: 494374911
        TotalClusters: 61796863
        BytesPerCluster: 4096
    

    Multiplying TotalClusters by BytesPerCluster gives the correct number of bytes (as reported by Win) for my E: drive.
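For reference, here is that multiplication spelled out with the two values from the output above (the GiB conversion is just my addition for readability):

```python
# Values taken from the script output above
total_clusters = 61796863
bytes_per_cluster = 4096

volume_bytes = total_clusters * bytes_per_cluster
volume_gib = volume_bytes / 2**30  # 1 GiB = 2**30 bytes

print(volume_bytes)          # 253119950848
print(round(volume_gib, 2))  # 235.74
```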