I came across this "32-bit integer" method of reading and applying the XOR mask to a WebSocket message, tried it in the very simple local server of my desktop application, and wondered whether it should really be ten times quicker.
It is at https://wiki.tcl-lang.org/page/WebSocket+Client+Library?R=0 in the code for proc ::websocket::__mask { mask dta }
I modified it very slightly into proc xor32, below, but was not expecting it to be that much faster; maybe 4 times, but not 10. The proc xor is the method shown in all the other instructions I could find when first researching how to receive messages over a WebSocket.
My question is: is this a good approach that can truly be expected to be 10 times faster, or am I misinterpreting the results?
Thank you.
proc xor {mask input} {
    binary scan $mask cu4 mask_key
    binary scan $input cu* pre_xor
    set offset -1
    set post_xor {}
    foreach b $pre_xor {
        append post_xor \
            "[expr {$b ^ [lindex $mask_key [expr {[incr offset] % 4}]]}] "
    }
    return [binary format cu* $post_xor]
}
proc xor32 { mask input } {
    # Format data as a list of 32-bit integer
    # words and list of 8-bit integer byte leftovers. Then unmask
    # data, recombine the words and bytes, and return
    binary scan $mask Iu mask_key
    binary scan $input I*c* words bytes
    set masked_words {}
    set masked_bytes {}
    foreach word $words {
        lappend masked_words [expr {$word ^ $mask_key}]
    }
    set i -1
    foreach byte $bytes {
        lappend masked_bytes \
            [expr {$byte ^ ($mask_key >> (24 - 8 * [incr i]))}]
    }
    return [binary format I*c* $masked_words $masked_bytes]
}
set filename {book.html}
# Open in binary mode so the message is raw bytes, as it would be on a socket
set fp [open $filename rb]
set size [file size $filename]
puts "message size: $size"
set message [chan read $fp]
close $fp
set maskKeys {100 42 9 67}
# set binmask [binary format I1 $maskKeys]
set binmask [binary format cu4 $maskKeys]
set encoded [xor $binmask $message]
#puts [xor $binmask $encoded]
puts "xor avg time 100 iterations: [time {xor $binmask $encoded} 100]"
set encoded [xor32 $binmask $message]
#puts [xor32 $binmask $encoded]
puts "xor32 avg time 100 iterations: [time {xor32 $binmask $encoded} 100]"
# message size: 1063714
# xor avg time 100 iterations: 299470.44 microseconds per iteration
# xor32 avg time 100 iterations: 28932.12 microseconds per iteration
Count up the number of Tcl commands each approach needs to process a byte and the performance makes sense. Extending the principle to process 64-bit chunks at a time with xor64, defined as:
proc xor64 { mask input } {
    # Format data as a list of 64-bit integer
    # words and list of 8-bit integer byte leftovers. Then unmask
    # data, recombine the words and bytes, and return
    binary scan $mask Iu mask_key
    binary scan $input W*c* qwords bytes
    set masked_qwords {}
    set masked_bytes {}
    # Repeat the 32-bit key in both halves of a 64-bit key
    # (bitwise |, not logical ||, which would yield just 0 or 1)
    set qmask_key [expr {$mask_key | ($mask_key << 32)}]
    foreach qword $qwords {
        lappend masked_qwords [expr {$qword ^ $qmask_key}]
    }
    set i -1
    foreach byte $bytes {
        lappend masked_bytes \
            [expr {$byte ^ ($qmask_key >> (56 - 8 * [incr i]))}]
    }
    return [binary format W*c* $masked_qwords $masked_bytes]
}
you can see that the pattern continues to hold: the run time falls roughly in proportion to the number of Tcl commands executed:
message size: 1549
xor   avg time 100 iterations: 285.66 microseconds per iteration   775504 Tcl commands
xor32 avg time 100 iterations: 25.9 microseconds per iteration     78904 Tcl commands
xor64 avg time 100 iterations: 11.45 microseconds per iteration    41504 Tcl commands
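One way to obtain command counts like those above is to read `info cmdcount` before and after a call. The following is only a sketch: `countCommands`, `xorBytes` and `xorWords` are hypothetical names introduced here (the latter two are cut-down copies of xor and xor32 so the snippet runs on its own), and the exact counts reported will likely vary with the Tcl version and how the procs are byte-compiled.

```tcl
# Hypothetical helper (not from the original post): evaluate a script in the
# caller's scope and return how many commands the interpreter counted.
proc countCommands {script} {
    set before [info cmdcount]
    uplevel 1 $script
    expr {[info cmdcount] - $before}
}

# One XOR per byte, following the same approach as xor.
proc xorBytes {mask input} {
    binary scan $mask cu4 key
    binary scan $input cu* bytes
    set i -1
    set out {}
    foreach b $bytes {
        lappend out [expr {$b ^ [lindex $key [expr {[incr i] % 4}]]}]
    }
    return [binary format cu* $out]
}

# One XOR per 32-bit word plus up to three leftover bytes, following xor32.
proc xorWords {mask input} {
    binary scan $mask Iu key
    binary scan $input I*c* words bytes
    set outWords {}
    set outBytes {}
    foreach w $words {
        lappend outWords [expr {$w ^ $key}]
    }
    set i -1
    foreach b $bytes {
        lappend outBytes [expr {$b ^ ($key >> (24 - 8 * [incr i]))}]
    }
    return [binary format I*c* $outWords $outBytes]
}

# Assumed test data: the mask from the question and 1549 identical bytes.
set mask [binary format cu4 {100 42 9 67}]
set data [binary format cu* [lrepeat 1549 65]]

set nBytes [countCommands {set a [xorBytes $mask $data]}]
set nWords [countCommands {set b [xorWords $mask $data]}]

puts "xorBytes: $nBytes commands"
puts "xorWords: $nWords commands"
puts "outputs identical: [expr {$a eq $b}]"
```

Comparing the two outputs also serves as a quick check that the word-wise version masks the data exactly as the byte-wise one does, including the leftover bytes when the length is not a multiple of four.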