scodec

Encoding vector length field not adjacent to the vector


I have the following structure that I would like to encode. I'm aware that I can encode a vector with vectorOfN() if the size field sits directly in front of the vector data. Here, however, the field that encodes the vector size is not adjacent to the vector.

case class Item(
    address: Int,
    size: Int
)
case class Header(
    // lots of other fields before
    numOfItems: Int
    // lots of other fields after
)
case class Outer(
    hdr: Header,
    items: Vector[Item]
)

Decoding of Outer is OK:

Header.numOfItems is read from the bit vector, and items is created with vectorOfN(provide(hdr.numOfItems), Item.codec).
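
For illustration, the decoding approach described above might be wired up roughly as follows. This is only a sketch assuming scodec 1.x (with shapeless); the three-field Header, the names DecodeSketch, itemC and headerC, and the use of flatZip are assumptions made for the example, not necessarily what the real code looks like.

```scala
import scodec._
import scodec.codecs._

object DecodeSketch {
  case class Item(address: Int, size: Int)
  // Hypothetical header: one field before and one field after the count
  case class Header(fieldBefore: Int, numOfItems: Int, fieldAfter: Int)
  case class Outer(hdr: Header, items: Vector[Item])

  val itemC: Codec[Item] = (int32 :: int32).as[Item]
  val headerC: Codec[Header] = (int32 :: int32 :: int32).as[Header]

  // Decode the header first, then use its count to size the vector.
  // provide(n) reads and writes no bits; it only supplies the count.
  val outerC: Codec[Outer] =
    headerC
      .flatZip(h => vectorOfN(provide(h.numOfItems), itemC))
      .xmap[Outer]({ case (h, items) => Outer(h, items) }, o => (o.hdr, o.items))
}
```

With a codec shaped like this, the count written on encode comes from hdr.numOfItems, so the caller has to keep it in sync with items.length by hand.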

Encoding of Outer is the problem:

When encoding, I would like numOfItems to be taken from items.length. I'm aware that I could set numOfItems with additional code whenever the items Vector is updated, or with something like a "before encoding" callback.

The question is whether there is a more elegant solution. To me, Header.numOfItems is redundant with Outer.items.length, so ideally only the Encoder should know about numOfItems.


Solution

  • You could try building a Codec using consume(), starting from a flat intermediate class instead of building the Outer object directly:

    case class OuterExpanded(
      fieldBefore: Int, // Field before number of items in the binary encoding
      fieldAfter: Int,  // Field after number of items in the binary encoding
      items: Vector[Item] // Encoded items
    )
    
    // Single Item codec
    def itemC: Codec[Item] = (int32 :: int32).as[Item] 
    
    def outerExpandedC: Codec[OuterExpanded] = ( 
      int32 ::                          // Field before count
      int32.consume( c =>               // Item count 
          int32 ::                      // Field after count
          vectorOfN(provide(c), itemC))   // 'consume' (use and forget) the count
        (_.tail.head.length)              // provide the length when encoding
      ).as[OuterExpanded]
    

    As defined above, you get the following when encoding: outerExpandedC.encode(OuterExpanded(-1, -2, Vector(Item(1,2), Item(3,4)))) returns

    Successful(BitVector(224 bits, 
         0xffffffff00000002fffffffe00000001000000020000000300000004))
                  ^       ^       ^       ^-------^-> First Item
                  |-1     |       |-2
                          |Vector length inserted between the two header fields
    

    Afterwards, you can xmap() the Codec[OuterExpanded] to pack the other header fields together into their own object, i.e. by adding two conversion methods, one to Outer and one to OuterExpanded:

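    // Hdr was not defined in the original snippet; this shape is assumed.
    // It holds the header fields without the item count.
    case class Hdr(fieldBefore: Int, fieldAfter: Int)

    // The explicit type parameter helps Scala 2 infer the lambda types
    def outerC: Codec[Outer] =
      outerExpandedC.xmap[Outer](_.toOuter, _.expand)

    case class OuterExpanded(fieldBefore: Int, fieldAfter: Int, items: Vector[Item]) {
      def toOuter = Outer(Hdr(fieldBefore, fieldAfter), items)
    }

    case class Outer(header: Hdr, items: Vector[Item]) {
      def expand = OuterExpanded(header.fieldBefore, header.fieldAfter, items)
    }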
    

    This can probably be adapted to more complex cases, though I'm not entirely familiar with shapeless' heterogeneous lists (HList). There might be nicer ways to get at the length of the vector than calling _.tail.head.length as in the example above, especially if you end up with more than one field after the number of encoded values.

    Also, the Codec scaladoc is a good place to discover useful operators.
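
One possible alternative to the _.tail.head.length accessor, when several fields follow the count, is to pattern match on the HList so the vector is picked out by position in a readable way. The sketch below assumes scodec 1.x with shapeless. The four-field OuterExpanded layout and the names ConsumeSketch, afterCount and countThenRest are made up for illustration.

```scala
import scodec._
import scodec.codecs._
import shapeless.{::, HNil}

object ConsumeSketch {
  case class Item(address: Int, size: Int)

  // Hypothetical layout: one field before the count, two fields after it
  case class OuterExpanded(
    fieldBefore: Int,
    fieldAfter1: Int,
    fieldAfter2: Int,
    items: Vector[Item]
  )

  val itemC: Codec[Item] = (int32 :: int32).as[Item]

  // Everything that follows the count on the wire, as an HList codec
  def afterCount(count: Int): Codec[Int :: Int :: Vector[Item] :: HNil] =
    int32 :: int32 :: vectorOfN(provide(count), itemC)

  // consume: the count is read/written on the wire but dropped from the type.
  // When encoding, it is recomputed from the vector found by pattern matching.
  val countThenRest: Codec[Int :: Int :: Vector[Item] :: HNil] =
    int32.consume(c => afterCount(c)) { case _ :: _ :: items :: HNil => items.length }

  val outerExpandedC: Codec[OuterExpanded] =
    (int32 :: countThenRest).as[OuterExpanded]
}
```

The pattern has one `_` per field between the count and the vector, so adding a field means adding one placeholder rather than another .tail call.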