crystal-lang

How can I stream multiple files at the same time using HTTP::Server?


I'm working on an HTTP service that serves big files. I noticed that parallel downloads do not work: the process serves only one file at a time, and all other downloads wait until the previous ones finish. How can I stream multiple files at the same time?

    require "http/server"

    server = HTTP::Server.new(3000) do |context|
      context.response.content_type = "application/data"
      f = File.open "bigfile.bin", "r"
      IO.copy f, context.response.output
    end

    puts "Listening on http://127.0.0.1:3000"
    server.listen

Request one file at a time:

    $ ab -n 10 -c 1 127.0.0.1:3000/

    [...]
    Percentage of the requests served within a certain time (ms)
     50%      9
     66%      9
     75%      9
     80%      9
     90%      9
     95%      9
     98%      9
     99%      9
    100%      9 (longest request)

Request 10 files at once:

    $ ab -n 10 -c 10 127.0.0.1:3000/

    [...]
    Percentage of the requests served within a certain time (ms)
     50%     52
     66%     57
     75%     64
     80%     69
     90%     73
     95%     73
     98%     73
     99%     73
    100%     73 (longest request)

Solution

  • The problem here is that neither File#read nor writing to context.response.output ever blocks. Crystal's concurrency model is based on cooperatively scheduled fibers, where switching fibers only happens when IO blocks. Reading from disk with nonblocking IO is not possible, so the only part that can block is writing to context.response.output. However, disk IO is much slower than network IO on the same machine, so the write never blocks: ab reads at a rate much faster than the disk can provide data, even from the disk cache. This example is practically a perfect storm for breaking Crystal's concurrency.

    In the real world, it's much more likely that clients of the service will sit on a different machine across the network, so the response write will occasionally block. Furthermore, if you were reading from another network service or a pipe/socket, the read would also block. Another solution would be to use a thread pool to implement nonblocking file IO, which is what libuv does. As a side note, Crystal moved to libevent because libuv doesn't allow a multithreaded event loop (i.e. having any thread resume any fiber).
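
    To make the scheduling behaviour concrete, here is a minimal sketch, assuming the default single-threaded runtime (no -Dpreview_mt). The method run_workers and the worker names "a" and "b" are hypothetical and only serve the illustration. Two fibers append to a shared log; nothing in their loop blocks, so the first fiber keeps the scheduler until it is done unless it hands control over explicitly with Fiber.yield.

    ```crystal
    # Hypothetical demo: records the order in which two fibers run their steps.
    def run_workers(cooperative : Bool) : Array(String)
      log = [] of String
      done = Channel(Nil).new

      {"a", "b"}.each do |name|
        spawn do
          3.times do |step|
            log << "#{name}#{step}"
            # Without this explicit hand-off nothing in the loop blocks,
            # so the fiber keeps running until its loop has finished.
            Fiber.yield if cooperative
          end
          done.send(nil)
        end
      end

      # Wait until both workers have signalled completion.
      2.times { done.receive }
      log
    end

    p run_workers(false) # => ["a0", "a1", "a2", "b0", "b1", "b2"]
    p run_workers(true)  # steps of "a" and "b" interleave; exact order is up to the scheduler
    ```

    The first call mirrors the situation in the question: the handler for one request never reaches a blocking point, so the other requests only get a turn once it has finished.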

    Calling Fiber.yield to pass execution to any pending fiber is the correct solution. Here's an example that copies a file in chunks and yields after each chunk:

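        # Copies input to output in chunks of at most chunk_size bytes,
        # yielding after each chunk so other fibers (requests) can run.
        def copy_in_chunks(input, output, chunk_size = 4096)
          size = 1
          while size > 0
            # IO.copy returns the number of bytes copied; 0 means end of input.
            size = IO.copy(input, output, chunk_size)
            Fiber.yield
          end
        end

        # Inside the HTTP::Server handler block:
        File.open("bigfile.bin", "r") do |file|
          copy_in_chunks(file, context.response)
        end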
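
    For completeness, the snippet above could be wired into a server roughly as follows. This is a sketch under assumptions, not the code from the linked discussion: build_file_server is a hypothetical helper name, the block form of HTTP::Server.new is the one used by more recent Crystal releases (the question uses an older constructor that takes the port), and copy_in_chunks is changed slightly so that it returns the number of bytes copied.

    ```crystal
    require "http/server"

    # Copies `input` to `output` in chunks of at most `chunk_size` bytes and
    # yields to the scheduler after each chunk. Returns the total bytes copied.
    def copy_in_chunks(input : IO, output : IO, chunk_size = 4096) : Int64
      total = 0_i64
      loop do
        copied = IO.copy(input, output, chunk_size)
        break if copied == 0 # end of input reached
        total += copied
        Fiber.yield # give other requests a turn
      end
      total
    end

    # Hypothetical helper: builds a server that streams `path` for every request.
    def build_file_server(path : String) : HTTP::Server
      HTTP::Server.new do |context|
        context.response.content_type = "application/octet-stream"
        # The block form of File.open closes the file when the copy is done.
        File.open(path, "r") do |file|
          copy_in_chunks(file, context.response)
        end
      end
    end
    ```

    To start it, create the server with build_file_server("bigfile.bin"), bind it with server.bind_tcp 3000 and then call server.listen. A smaller chunk_size yields more often at the cost of more scheduling overhead, so the value is worth measuring with ab as in the question.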
    

    This is a transcription of the discussion here: https://github.com/crystal-lang/crystal/issues/4628

    Props to GitHub users @cschlack, @RX14 and @ysbaddaden