
Scala bitfields serialization


I am very new to Scala and am confused by its bit manipulation features. Could someone point me in the right direction?

I have a byte array defined with the following bit fields:

0-3 - magic number
  4 - version
5-7 - payload length in bytes
8-X - payload, of variable length, as indicated in bits 5-7

I would like to serialize this back and forth to a structure such as:

MagicNumber: Integer
Version: Integer
Length: Integer
payload: Array[Byte]

What is the best way to handle bit-level fields in this situation? Most of the examples I've seen deal with higher-level serialization, such as JSON. In this case, I am trying to serialize and deserialize binary data sent over TCP.


Solution

  • You can use Scala Pickling, POF, or Google Protobuf, but since your format is this constrained, the simplest approach is to write your own serializer:

    case class Data(magicNumber: Int, version: Int, payload: Array[Byte])
    
    def serialize(data: Stream[Data]): Stream[Byte] = 
       data.flatMap(x => 
         Array((x.magicNumber << 4 | x.version << 3 | x.payload.length).toByte) ++ x.payload)
    
    @scala.annotation.tailrec
    def deserialize(binary: Stream[Byte], acc: Stream[Data] = Stream[Data]()): Stream[Data] =   
       if(binary.nonEmpty) {
         val header = binary.head & 0xFF // Byte is signed; mask to avoid sign extension
         val magicNumber = header >> 4
         val version = (header & 0x08) >> 3
         val size = header & 0x07
         val data = Data(magicNumber, version, binary.tail.take(size).toArray)
         deserialize(binary.drop(size + 1), acc ++ Stream(data)) 
       } else acc
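
To make the bit arithmetic above concrete, here is a minimal sketch that packs and unpacks only the header byte. It assumes, as the serializer above does, that bit 0 of the layout in the question is the most significant bit. `HeaderByte`, `pack`, and `unpack` are illustrative names, not part of any library. Because `Byte` is signed on the JVM, the sketch masks with `0xFF` before shifting.

```scala
object HeaderByte {
  // Pack the three header fields into one byte.
  // Layout, most significant bit first: mmmm v lll
  def pack(magicNumber: Int, version: Int, length: Int): Byte =
    ((magicNumber << 4) | (version << 3) | length).toByte

  // Unpack a header byte into (magicNumber, version, length).
  // Masking with 0xFF first turns the signed Byte into an Int in 0..255,
  // so the shifts below do not drag sign bits into the result.
  def unpack(header: Byte): (Int, Int, Int) = {
    val b = header & 0xFF
    ((b >> 4) & 0x0F, (b >> 3) & 0x01, b & 0x07)
  }
}
```

For example, `HeaderByte.pack(10, 1, 3)` yields `0xAB.toByte` (which is -85 as a signed `Byte`), and `HeaderByte.unpack` turns it back into `(10, 1, 3)`. Shifting the raw byte without the mask would give `-85 >> 4 == -6` rather than 10.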
    

    Alternatively, you can use the scodec library. This option is better because you get automatic value range checks:

    Sbt:

      libraryDependencies += "org.typelevel" %% "scodec-core" % "1.3.0"
    

    Codec:

      import scodec._, scodec.bits._, scodec.codecs._

      case class Data(magicNumber: Int, version: Int, payload: ByteVector)
      val codec = (uint(4) :: uint(1) :: variableSizeBytes(uint(3), bytes)).as[Data]
    

    Use:

      val encoded = codec.encode(Data(2, 1, bin"01010101".bytes)).fold(sys.error, _.toByteArray)
      val decoded = codec.decode(BitVector(encoded)).fold(sys.error, _._2)
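
If adding a library is not an option, the range checks mentioned above can be approximated by hand before packing the header. The following is only a sketch under that assumption: `FieldRanges`, `fits`, and `check` are hypothetical names, and the bit widths (4, 1, and 3) come from the layout in the question. Without such a check, a hand-written serializer silently truncates an out-of-range value or lets it spill into a neighbouring field.

```scala
object FieldRanges {
  // True if `value` can be stored as an unsigned integer in `bits` bits.
  def fits(value: Int, bits: Int): Boolean =
    value >= 0 && value < (1 << bits)

  // Hypothetical helper: validates the header fields before they are packed.
  // Returns Left with a message for the first field that is out of range.
  def check(magicNumber: Int, version: Int, payload: Array[Byte]): Either[String, Unit] =
    if (!fits(magicNumber, 4))
      Left(s"magicNumber $magicNumber does not fit in 4 bits")
    else if (!fits(version, 1))
      Left(s"version $version does not fit in 1 bit")
    else if (!fits(payload.length, 3))
      Left(s"payload of ${payload.length} bytes exceeds the 7-byte limit")
    else
      Right(())
}
```

With this helper, `FieldRanges.check(2, 1, Array[Byte](0x55))` returns `Right(())`, while a magic number of 16 or an 8-byte payload returns a `Left` describing the problem.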