Tags: arrays, swift, optimization, nsdata, mask

Best Data get/set uint8 at index / Data masking


I'm trying to create a Data masking function.

I found two ways:

  1. using data subscripts

    • very slow
  2. creating array from data, change it and then convert it back

    • ~70 times faster
    • uses 2 times more memory

Why is Data subscripting so slow? Is there a better way to get/set a UInt8 at an index without duplicating the memory?

Here is my test:

import Foundation

var data = Data(bytes: [UInt8](repeating: 123, count: 100_000_000))

let a = CFAbsoluteTimeGetCurrent()

// data masking
for i in 0..<data.count {
  data[i] = data[i] &+ 1
}

let b = CFAbsoluteTimeGetCurrent()

// creating array
var bytes = data.withUnsafeBytes {
  [UInt8](UnsafeBufferPointer(start: $0, count: data.count))
}
for i in 0..<bytes.count {
  bytes[i] = bytes[i] &+ 1
}
data = Data(bytes: bytes)

let c = CFAbsoluteTimeGetCurrent()
print(b-a) // 8.8887130022049
print(c-b) // 0.12415999174118

Solution

  • I cannot tell you exactly why the first method (subscripting the Data value) is so slow. According to Instruments, a lot of time is spent in objc_msgSend, calling methods on the underlying NSMutableData object.

    But you can mutate the bytes without copying the data to an array:

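    // Read the count up front: `data` must not be accessed inside the
    // closure while withUnsafeMutableBytes holds exclusive access to it.
    let count = data.count
    data.withUnsafeMutableBytes { (bytes: UnsafeMutablePointer<UInt8>) -> Void in
        for i in 0..<count {
            bytes[i] = bytes[i] &+ 1
        }
    }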
    

    which is even faster than your "copy to array" method.
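The snippet above uses the typed-pointer closure from the Swift 3 era. Assuming Swift 5 or later, where the closure receives an UnsafeMutableRawBufferPointer instead, the same in-place mutation might look like the sketch below. The function name incrementBytes is hypothetical and exists only for this illustration.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): adds 1, with wrap-around,
// to every byte of `data` in place, without copying the bytes to an array.
func incrementBytes(in data: inout Data) {
    data.withUnsafeMutableBytes { (buffer: UnsafeMutableRawBufferPointer) -> Void in
        // The raw buffer is a mutable collection of UInt8 with Int indices.
        for i in buffer.indices {
            buffer[i] = buffer[i] &+ 1
        }
    }
}

var sample = Data([0, 1, 254, 255])
incrementBytes(in: &sample)
print(Array(sample)) // [1, 2, 255, 0]
```

Because the loop iterates over the buffer's own indices, it never needs to read data.count inside the closure.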

    On a MacBook I got the following results:

    • Data subscripting: 7.15 sec
    • Copy to array and back: 0.238 sec
    • withUnsafeMutableBytes: 0.0659 sec
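
To reproduce this kind of comparison, a rough timing sketch along the following lines may help. It assumes Swift 5 or later; the measure helper and the reduced buffer size are assumptions made for this illustration, not part of the original test. Absolute numbers will differ with Swift version, optimization settings, and hardware, so no expected timings are shown.

```swift
import Foundation

// Hypothetical helper: runs `body` once and returns the elapsed wall-clock seconds.
func measure(_ body: () -> Void) -> TimeInterval {
    let start = Date()
    body()
    return Date().timeIntervalSince(start)
}

// Much smaller than the 100_000_000 bytes used above, so the slow path finishes quickly.
let byteCount = 1_000_000

// Method 1: read and write each byte through the Data subscript.
var viaSubscript = Data(repeating: 123, count: byteCount)
let subscriptTime = measure {
    for i in 0..<viaSubscript.count {
        viaSubscript[i] = viaSubscript[i] &+ 1
    }
}

// Method 2: mutate the underlying storage directly through a raw buffer pointer.
var viaPointer = Data(repeating: 123, count: byteCount)
let pointerTime = measure {
    viaPointer.withUnsafeMutableBytes { (buffer: UnsafeMutableRawBufferPointer) -> Void in
        for i in buffer.indices {
            buffer[i] = buffer[i] &+ 1
        }
    }
}

print("Data subscripting: \(subscriptTime) sec")
print("withUnsafeMutableBytes: \(pointerTime) sec")
```

Building with optimizations enabled (for example `swiftc -O`) matters for measurements like these, since unoptimized builds can exaggerate the gap.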