Reversing Bytes and cross compatible binary parsing in Nim


I've started taking a look at Nim for hobby game modding purposes.

Intro

However, compared to C, I found it difficult to work with machine-specific low-level memory layout in Nim, and I would like to know whether Nim actually has better support for this.

I need to control byte order and be able to de/serialize arbitrary Plain-Old-Datatype objects to custom binary file formats. I couldn't find a Nim library that allows flexible storage options, such as representing enums and pointers as big-endian 32-bit values. Or maybe I just don't know how to use such a feature.

  • std/marshal : JSON only, so neither efficient nor flexible and not a binary format, but cross-compatible
  • nim-serialization : seems to be made for human-readable formats
  • nesm : flexible cross-compatibility? (It has some options and has a good interface)
  • flatty : no flexible cross-compatibility, no byte order?
  • msgpack4nim : no flexible cross-compatibility, byte order?
  • bingo : ?

Flexible cross-compatibility means that it must be able to de/serialize fields independently of Nim's ABI, with customization options.

Maybe "Kaitai Struct", a file parser with experimental Nim support, is closer to what I'm looking for.

TL;DR

As a workaround for the lack of a serialization library, I tried my hand at a recursive "member fields reverser" that makes use of std/endians, which is almost sufficient.

But I didn't succeed with implementing byte reversal of arbitrarily long objects in Nim. Not practically relevant but I still wonder if Nim has a solution.

I found reverse() and reversed() in std/algorithm, but I need a byte array to reverse and then have to turn it back into the original object type. In C++ there would be reinterpret_cast, in C there is a void* cast, and in D there is a void[] cast (D allows defining array slices from pointers), but I couldn't get anything similar working in Nim.

I tried cast[ptr array[value.sizeof, byte]](unsafeAddr value)[] but I can't assign it to a new variable. Maybe there was a different problem.

How to "byte reverse" arbitrarily long Plain-Old-Datatype objects?

How to serialize to binary files with byte order, member field size, pointer as file "offset - start offset"? Are there bitfield options in Nim?


Solution

  • It is indeed possible to use algorithm.reverse and the appropriate cast invocation to reverse bytes in-place:

    import std/[algorithm,strutils,strformat]
    
    type
      LittleEnd{.packed.} = object
        a: int8
        b: int16
        c: int32
      BigEnd{.packed.} = object
        c: int32
        b: int16
        a: int8
    
    ## just so we can see what's going on:
    proc `$`(b: LittleEnd):string = &"(a:0x{b.a.toHex}, b:0x{b.b.toHex}, c:0x{b.c.toHex})"
    proc `$`(l:BigEnd):string = &"(c:0x{l.c.toHex}, b:0x{l.b.toHex}, a:0x{l.a.toHex})"
    
    
    var lit = LittleEnd(a: 0x12, b:0x3456, c: 0x789a_bcde)
    echo lit # (a:0x12, b:0x3456, c:0x789ABCDE)
    
    var big:BigEnd
    
    copyMem(big.addr,lit.addr,sizeof(lit))
    
    # here's the reinterpret_cast you were looking for:
    cast[var array[sizeof(big),byte]](big.addr).reverse
    
    echo big # (c:0xDEBC9A78, b:0x5634, a:0x12)
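Note that reversing the whole object also reverses the field order, which is why BigEnd declares its fields in the opposite order. If the file format keeps the field order and only stores each field big-endian, converting field by field with std/endians (which the question already mentions) may be a better fit. The following is a minimal sketch under that assumption; `toBigEndian` and `fromBigEndian` are hypothetical helper names, not part of std/endians:

```nim
import std/endians

# Hypothetical helpers (not part of std/endians): convert a single
# 32-bit field to and from big-endian byte order.
# bigEndian32 swaps on little-endian hosts and plain-copies on
# big-endian hosts, so the resulting byte layout is host-independent.
proc toBigEndian(x: int32): array[4, byte] =
  var tmp = x
  bigEndian32(addr result, addr tmp)

proc fromBigEndian(bytes: array[4, byte]): int32 =
  var tmp = bytes
  bigEndian32(addr result, addr tmp)

let raw = toBigEndian(0x789a_bcde'i32)
doAssert raw == [0x78'u8, 0x9a, 0xbc, 0xde]
doAssert fromBigEndian(raw) == 0x789a_bcde'i32
```

The same pattern should carry over to 16-bit and 64-bit fields via bigEndian16 and bigEndian64.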
    

    For C-style bitfields there is also the {.bitsize.} pragma, but using it causes Nim to lose sizeof information, and of course bitfields won't be reversed within bytes:

    import std/[algorithm,strutils,strformat]
    
    type
      LittleNib{.packed.} = object
        a{.bitsize: 4}: int8
        b{.bitsize: 12}: int16
        c{.bitsize: 20}: int32
        d{.bitsize: 28}: int32
      BigNib{.packed.} = object
        d{.bitsize: 28}: int32
        c{.bitsize: 20}: int32
        b{.bitsize: 12}: int16
        a{.bitsize: 4}: int8
    const nibsize = 8
    
    proc `$`(b: LittleNib):string = &"(a:0x{b.a.toHex(1)}, b:0x{b.b.toHex(3)}, c:0x{b.c.toHex(5)}, d:0x{b.d.toHex(7)})"
    proc `$`(l:BigNib):string = &"(d:0x{l.d.toHex(7)}, c:0x{l.c.toHex(5)}, b:0x{l.b.toHex(3)}, a:0x{l.a.toHex(1)})"
    var lit = LittleNib(a: 0x1,b:0x234, c:0x56789, d: 0x0abcdef)
    echo lit # (a:0x1, b:0x234, c:0x56789, d:0x0ABCDEF)
    
    
    var big:BigNib
    
    copyMem(big.addr,lit.addr,nibsize)
    cast[var array[nibsize,byte]](big.addr).reverse
    echo big # (d:0x5DEBC0A, c:0x8967F, b:0x123, a:0x4)
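Because the placement of {.bitsize.} fields is left to the C compiler, a file format with a fixed bit layout can alternatively be handled with explicit shifts and masks. The sketch below assumes a made-up 16-bit word with a 4-bit field `a` in the high bits and a 12-bit field `b` in the low bits; `packWord` and `unpackWord` are hypothetical names used only for illustration:

```nim
# Hypothetical layout: one 16-bit word, `a` in bits 12..15, `b` in bits 0..11.
proc packWord(a, b: uint16): uint16 =
  # Mask each field to its width, then shift `a` into the high nibble.
  ((a and 0x000F'u16) shl 12) or (b and 0x0FFF'u16)

proc unpackWord(w: uint16): tuple[a, b: uint16] =
  # Reverse of packWord: shift out the high nibble, mask the low 12 bits.
  (a: w shr 12, b: w and 0x0FFF'u16)

let w = packWord(0x1, 0x234)
doAssert w == 0x1234'u16
doAssert unpackWord(w) == (a: 0x1'u16, b: 0x234'u16)
```

The packed word is an ordinary integer, so its byte order can then be handled like any other field.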
    

    Copying the bytes over and then rearranging them with reverse is less than optimal anyway, so you might just want to copy the bytes over in reverse order in a loop. Here's a proc that can swap the endianness of any object (including ones for which sizeof is not known at compile time):

    template asBytes[T](x:var T):ptr UncheckedArray[byte] = 
      cast[ptr UncheckedArray[byte]](x.addr)
    
    proc swapEndian[T,U](src:var T,dst:var U) =
      assert sizeof(src) == sizeof(dst)
      let len = sizeof(src)
      for i in 0..<len:
        dst.asBytes[len - i - 1] = src.asBytes[i]
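
As a usage sketch, the proc above can be applied to the LittleEnd/BigEnd types from the first example. The definitions are repeated here so the snippet compiles on its own, and the expected values assume a little-endian host:

```nim
type
  LittleEnd {.packed.} = object
    a: int8
    b: int16
    c: int32
  BigEnd {.packed.} = object
    c: int32
    b: int16
    a: int8

template asBytes[T](x: var T): ptr UncheckedArray[byte] =
  cast[ptr UncheckedArray[byte]](x.addr)

proc swapEndian[T, U](src: var T, dst: var U) =
  assert sizeof(src) == sizeof(dst)
  let len = sizeof(src)
  # Copy byte i of src to the mirrored position in dst.
  for i in 0 ..< len:
    dst.asBytes[len - i - 1] = src.asBytes[i]

var lit = LittleEnd(a: 0x12, b: 0x3456, c: 0x789a_bcde)
var big: BigEnd
swapEndian(lit, big)

# Same result as copyMem followed by reverse (little-endian host assumed).
doAssert big.c == cast[int32](0xDEBC9A78'u32)
doAssert big.b == 0x5634
doAssert big.a == 0x12
```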