I am extremely new to Scala and I'm getting confused by its bit manipulation features. I am hoping someone can point me in the right direction.
I have a byte array defined with the following bit fields:
0-3 - magic number
4 - version
5-7 - payload length in bytes
8-X - payload, of variable length, as indicated in bits 5-7
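To make the layout concrete, here is a small sketch of how the header byte could be composed. It assumes bit 0 is the most significant bit of the first byte (the numbering above does not say which end it starts from); `HeaderLayout` and `packHeader` are illustrative names only, not part of any library.

```scala
object HeaderLayout {
  // Assumption: bit 0 is the most significant bit, so the byte reads MMMM V LLL.
  // Each field is masked to its width before being shifted into place.
  def packHeader(magicNumber: Int, version: Int, length: Int): Byte =
    (((magicNumber & 0x0F) << 4) | ((version & 0x01) << 3) | (length & 0x07)).toByte

  // magic = 2 (0010), version = 1 (1), length = 3 (011) gives 0010 1 011 = 0x2B
  val example: Byte = packHeader(2, 1, 3)
}
```

With this layout, a header of `0x2B` would be followed by exactly three payload bytes.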
I would like to serialize this back and forth to a structure such as:
MagicNumber: Integer
Version: Integer
Length: Integer
payload: Array[Byte]
How do you deal with bits in this situation optimally? Most of the examples I've seen deal with higher level serialization, such as JSON. I am trying to serialize and deserialize TCP binary data in this case.
You can use Scala Pickling, POF, or Google Protobuf, but if your format is this restricted, the simplest way is to write your own serializer:
case class Data(magicNumber: Int, version: Int, payload: Array[Byte])
// Header byte layout, most significant bit first: MMMM V LLL
def serialize(data: Stream[Data]): Stream[Byte] =
  data.flatMap(x =>
    Array(((x.magicNumber << 4) | (x.version << 3) | x.payload.length).toByte) ++ x.payload)
@scala.annotation.tailrec
def deserialize(binary: Stream[Byte], acc: Stream[Data] = Stream[Data]()): Stream[Data] =
  if (binary.nonEmpty) {
    // Byte is signed: mask to an unsigned Int first so that >> does not sign-extend
    val header = binary.head & 0xFF
    val magicNumber = header >> 4
    val version = (header & 0x08) >> 3
    val size = header & 0x07
    // Data takes an Array[Byte] here, so no ByteVector wrapper is needed
    val data = Data(magicNumber, version, binary.tail.take(size).toArray)
    deserialize(binary.drop(size + 1), acc ++ Stream(data))
  } else acc
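The hand-written version above does not validate its input, so an out-of-range field or a truncated payload goes unnoticed. As a rough sketch of how the same bit layout could be handled with explicit checks, the following works on a single frame held in an `Array[Byte]`. The names `FrameCodec`, `Frame`, `encode` and `decode` are illustrative assumptions, and the sketch assumes bit 0 is the most significant bit of the header byte.

```scala
object FrameCodec {
  final case class Frame(magicNumber: Int, version: Int, payload: Vector[Byte])

  // Pack one frame: header byte MMMM V LLL followed by the payload bytes.
  // Returns Left with a message if any field does not fit its bit width.
  def encode(f: Frame): Either[String, Array[Byte]] =
    if (f.magicNumber < 0 || f.magicNumber > 15)
      Left(s"magic number ${f.magicNumber} does not fit in 4 bits")
    else if (f.version < 0 || f.version > 1)
      Left(s"version ${f.version} does not fit in 1 bit")
    else if (f.payload.length > 7)
      Left(s"payload of ${f.payload.length} bytes does not fit a 3-bit length")
    else {
      val header = (f.magicNumber << 4) | (f.version << 3) | f.payload.length
      Right((header.toByte +: f.payload).toArray)
    }

  // Decode one frame from the front of `bytes`.
  // Returns the frame together with the bytes that were not consumed.
  def decode(bytes: Array[Byte]): Either[String, (Frame, Array[Byte])] =
    if (bytes.isEmpty) Left("no header byte")
    else {
      val header = bytes(0) & 0xFF // widen to an unsigned Int before shifting
      val magic = header >> 4
      val version = (header >> 3) & 0x01
      val length = header & 0x07
      if (bytes.length < 1 + length)
        Left(s"payload truncated: expected $length bytes, found ${bytes.length - 1}")
      else
        Right((Frame(magic, version, bytes.slice(1, 1 + length).toVector),
               bytes.drop(1 + length)))
    }
}
```

Returning the unread remainder from `decode` lets a caller loop over a buffer that holds several frames back to back, which is the usual situation when reading from a TCP stream.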
Or you can use the Scodec library (this option is better because you get automatic value range checks):
Sbt:
libraryDependencies += "org.typelevel" %% "scodec-core" % "1.3.0"
Codec:
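// Imports assumed for scodec 1.x: codec combinators plus the ByteVector/BitVector types
import scodec._
import scodec.bits._
import scodec.codecs._
case class Data(magicNumber: Int, version: Int, payload: ByteVector)
// 4-bit magic number, 1-bit version, then a 3-bit length prefix followed by that many bytes
val codec = (uint(4) :: uint(1) :: variableSizeBytes(uint(3), bytes)).as[Data]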
Use:
val encoded = codec.encode(Data(2, 1, bin"01010101".bytes)).fold(sys.error, _.toByteArray)
val decoded = codec.decode(BitVector(encoded)).fold(sys.error, _._2)