Given a (strict) ByteString, what is the most efficient way to count how many of each possible byte it contains?
I see that sort
is supposed to implemented as a counting sort - but there doesn't appear to be a way to access the associated counts. I also see that there's a count
function, which counts the number of times a given byte appears. This gives me the following options:
map (\ b -> count b str) [0x00 .. 0xFF]
map length . group . sort
fold*
and an IntMap
of byte frequencies.Which is likely to give me best performance?
The basic idea of dflemstr is of course right, but since you want the best performance, you need to use unchecked access to the ByteString
as well as to the counting array, like
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Base (unsafeRead, unsafeWrite)
import Data.Array.Unboxed
import Data.Word
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.ByteString.Unsafe
histogram :: ByteString -> UArray Word8 Int
histogram bs = runSTUArray $ do
hist <- newArray (0, 255) 0
let l = BS.length bs
update b = do
o <- unsafeRead hist b
unsafeWrite hist b (o+1)
loop i
| i < 0 = return hist
| otherwise = do
update $ fromIntegral (bs `unsafeIndex` i)
loop (i-1)
loop (l-1)
That makes a considerable difference, according to criterion
(building a histogram of a 200000 long ByteString
):
warming up
estimating clock resolution...
mean is 1.667687 us (320001 iterations)
found 3078 outliers among 319999 samples (1.0%)
1947 (0.6%) high severe
estimating cost of a clock call...
mean is 40.43765 ns (14 iterations)
benchmarking dflemstr
mean: 21.42852 ms, lb 21.05213 ms, ub 21.77954 ms, ci 0.950
std dev: 1.873897 ms, lb 1.719565 ms, ub 2.038779 ms, ci 0.950
variance introduced by outliers: 74.820%
variance is severely inflated by outliers
benchmarking unsafeIndex
mean: 312.6447 us, lb 304.3425 us, ub 321.0795 us, ci 0.950
std dev: 42.86886 us, lb 39.64363 us, ub 46.52899 us, ci 0.950
variance introduced by outliers: 88.342%
variance is severely inflated by outliers
(I changed dflemstr's code to also use runSTUArray
and return a UArray Word8 Int
to have uiform return values, that doesn't make a big difference in running time, though.)