haskell io performance bytestring

When do I use ByteString and when do I not?


I've been making rather poor attempts at the PRIME1 problem on SPOJ. I discovered that using ByteString really helped performance for reading in the problem text. However, using ByteString to write out the results is actually slightly slower than using Prelude functions. I'm trying to figure out if I'm doing it wrong, or if this is expected.

I've conducted profiling and timing using (putStrLn.show) and the ByteString equivalents three different ways:

  1. I test each candidate to see if it is prime. If so, I write it out immediately with (putStrLn . show)
  2. I make a list of all primes and write out the list using (putStrLn . unlines . map show)
  3. I make a list of all primes and write out the list using mapM_ (putStrLn . show)
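
For concreteness, the three variants could look roughly like the sketch below. This is not the code from my actual submission: `isPrime` is a stand-in trial-division test, and the function names `printAsFound`, `printJoined`, and `printEach` are made up for illustration. Because lists are lazy, variants 2 and 3 do not necessarily hold the whole list in memory at once.

```haskell
import Control.Monad (when)

-- Hypothetical trial-division test, only here to make the sketch runnable.
isPrime :: Int -> Bool
isPrime n = n >= 2 && all (\d -> n `mod` d /= 0) (takeWhile (\d -> d * d <= n) [2 ..])

-- Variant 1: test each candidate and print it as soon as it is known to be prime.
printAsFound :: [Int] -> IO ()
printAsFound = mapM_ (\n -> when (isPrime n) (print n))

-- Variant 2: build the list of primes, join it into one String, write it once.
-- putStr is used here because unlines already ends the text with a newline.
printJoined :: [Int] -> IO ()
printJoined = putStr . unlines . map show . filter isPrime

-- Variant 3: build the list of primes, then print one element at a time.
printEach :: [Int] -> IO ()
printEach = mapM_ (putStrLn . show) . filter isPrime

main :: IO ()
main = do
  printAsFound [1 .. 30]
  printJoined [1 .. 30]
  printEach [1 .. 30]
```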

I expected numbers 2 and 3 to perform slower, as you are building a list in one function and consuming it in another. By printing the numbers as I generate them, I avoid allocating any memory for the list. On the other hand, you are making a system call with each call to putStrLn. Right? So I tested, and #1 was in fact the fastest.

The best performance was achieved with option #1 and the Prelude ([Char]) functions. I expected that my best performance would be option #1 with ByteString, but this was not the case. I only used lazy ByteStrings, but I didn't think this would matter. Would it?

Some questions:

  • Would you expect the ByteStrings to perform better for writing a bunch of Integers to stdout?
  • Am I missing a pattern for generating and writing out the answers that would lead to better performance?
  • If I am only writing out numbers as text, when, if ever, is there a benefit to using ByteString?

My working hypothesis is that writing out Integers with ByteString is slower iff you aren't combining them with other text. If you are combining Integers with [Char], then you'd get better performance working with ByteStrings. I.e., the ByteString rewrite of:

putStrLn $ "the answer is: " ++ (show value)

will be much faster than the version written above. Is this true?
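
To make the comparison concrete, here is a minimal sketch of what I mean by "the ByteString rewrite", assuming lazy ByteStrings from `Data.ByteString.Lazy.Char8`. The helper name `answerLine` is made up for this example. Note that the Integer still goes through `show` and `pack`, so the number itself is converted via String either way.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Hypothetical helper: the ByteString counterpart of
--   "the answer is: " ++ show value
-- The Integer is still rendered with show, then packed.
answerLine :: Integer -> BL.ByteString
answerLine value = BL.pack "the answer is: " `BL.append` BL.pack (show value)

main :: IO ()
main = BL.putStrLn (answerLine 42)
```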

Thanks for reading!


Solution

  • Bulk input is usually faster with bytestrings: because the data is dense, there's simply less data to shuffle from the disk into memory.

    Writing data as output however, is a little different. Typically, you're serializing a structure, generating many small writes. So the dense, bulk writes of bytestrings don't help you much in that case. Even regular Strings will do reasonably at incremental output.

    However, all is not lost. We can recover fast bulk writes by efficiently building up bytestrings in memory. This approach is taken by the various *-builder packages.

    Instead of converting values to lots of tiny bytestrings and writing them out one at a time, we stream the conversion into an ever-growing buffer and, in turn, write that buffer in one big piece. This results in much less IO overhead and (often significant) performance improvements over String IO.

    This kind of approach is taken by e.g. webservers in Haskell, or the efficient HTML system, blaze.

    Also, the performance, even with bulk writes, will depend on the efficiency of whatever conversion function you have between your types and bytestrings. For Integer, you could be simply copying the bit pattern in memory to output, or instead going through some inefficient decoder. As a result, you sometimes have to think a bit about the quality of the encoding function you're using, and not just whether to use Char/String or bytestring IO.
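
As a rough illustration of the builder idea, here is a minimal sketch that assumes the `Data.ByteString.Builder` module shipped with the `bytestring` package and a reasonably recent GHC (so that `<>` and `foldMap` come from the Prelude). The function name `renderLines` is made up for this example, and this is one possible way to do it rather than the definitive one. Each Integer is encoded straight into the output buffer with `integerDec`, without an intermediate String, and the buffer is handed to the handle in large pieces.

```haskell
import Data.ByteString.Builder (Builder, char7, hPutBuilder, integerDec, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as BL
import System.IO (stdout)

-- Hypothetical helper: one decimal number per line, accumulated in a Builder.
-- integerDec encodes the Integer directly as ASCII decimal digits.
renderLines :: [Integer] -> Builder
renderLines = foldMap (\n -> integerDec n <> char7 '\n')

main :: IO ()
main = do
  -- Write the whole Builder to stdout in buffer-sized chunks.
  hPutBuilder stdout (renderLines [2, 3, 5, 7, 11])
  -- The same Builder can also be materialized as a lazy ByteString first.
  BL.putStr (toLazyByteString (renderLines [13, 17]))
```

Whether this beats plain `putStrLn . show` for a given program still depends on the encoder and on how the handle is buffered, so it is worth measuring rather than assuming.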