How should I handle a long-running HTTP request, using Scotty (Haskell)?

I'm making a simple web app that looks for color words in a text, and plots statistics about them. You can test it at colors.jonreeve.com if it's not too busy. I'm using the Scotty web framework to handle the web stuff. It works OK for short texts, but longer texts, like full novels, take so long that the browser normally times out. So I'm guessing what I need here is to send the form via Jquery AJAX or something, and then have the server send JSON every so often with its status ("now loading file," "now counting colors," etc) and then when it receives a "success" signal, then redirect to some other URL?

This is my first time trying to do something like this, so forgive me if this all sounds uninformed. I also noticed that there are some similar questions out there, but I have a feeling that Scotty handles things a little differently than most setups. I noticed that there are a few functions for setting raw output, setting headers and so forth. Do I try to emit certain signals at each stage in the analysis? And how would I do that, given Haskell's handling of side-effects? I'm struggling to even think of the best approach, here.

Solution

Instead of a single long-running GET request, I would perhaps set up an endpoint accepting POST requests. The POST would return immediately with two links in the response body:

one link to a new resource representing the task result, which wouldn't be immediately available. Until then, GET requests to the result could return 409 (Conflict).
one link to a related, immediately available resource representing notifications emitted while performing the task.

Once the client has made a successful GET of the task result resource, it could DELETE it. That should delete both the task result resource and the associated notification resource.

For each POST request, you would need to spawn a background worker thread. You would also need a background thread for deleting task results that grew old (because the clients could be lazy and not invoke DELETE). These threads would communicate with MVars, TVars, channels or similar methods.

Now the question is: how to best handle the notifications emitted by the server? There are several options:

Just poll periodically the notification resource from the client. Disadvantages: potentially many HTTP requests, notifications are not received promptly.
long polling. A sequence of GET requests which are kept open until the server wants to emit some notification, or until a timeout.
server-sent events. wai-extra has support for this, but I don't know how to hook a raw wai Application back into Scotty.
websockets. Not sure how to integrate with Scotty though.

Here's the server-side skeleton of a long polling mechanism. Some preliminary imports:

{-# LANGUAGE NumDecimals #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeApplications #-}
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async (concurrently_) -- from async
import Control.Concurrent.STM -- from stm
import Control.Concurrent.STM.TMChan -- from stm-chans
import Control.Monad.IO.Class (liftIO)
import Data.Aeson (ToJSON) -- from aeson
import Data.Foldable (for_)
import Data.Text (Text) 
import Web.Scotty

And here is the main code.

main :: IO ()
main =
  do
    chan <- atomically $ newTMChan @Text
    concurrently_
      ( do
          for_
            ["starting", "working on it", "finishing"]
            ( \msg -> do
                threadDelay 10e6
                atomically $ writeTMChan chan msg
            )
          atomically $ closeTMChan chan
      )
      ( scotty 3000
          $ get "/notifications"
          $ do
            mmsg <- liftIO $ atomically $ readTMChan chan
            json $
              case mmsg of
                Nothing -> ["closed!"]
                Just msg -> [msg]
      )

There are two concurrent threads. One feeds messages into a closeable channel at 10 second intervals, the other runs a Scotty server, where each GET invocation hangs until a new message arrives in the channel.

Testing it from bash using curl, we should see a succession of messages:

bash$ for run in {1..4}; do curl -s localhost:3000/notifications ; done
["starting"]["working on it"]["finishing"]["closed!"]