
parSequence and parTraverse in tagless final


Using tagless final (that is, with a generic F rather than IO), how can I abstract over something like this?

def doSomething(s: String): IO[Unit] = ???

List("authMethods", "secretEngines", "plugins", "CAs", "common").parTraverse(doSomething)

The closest I can get is using parTraverseN from the Concurrent object, but I assume this will run concurrently instead of in parallel (as in parallelism). It also forces me to choose an n, whereas parTraverse does not.

The size of the list is just an example; it could be much bigger. doSomething is a pure function, so multiple executions of it can run in parallel without problems.

Ideally, given that doSomething returns IO[Unit], I would like to abstract over parTraverse_ for an F with the correct type class instance.


Solution

  • Here's a similar complete working example:

    import cats.Applicative
    import cats.instances.list._
    import cats.syntax.foldable._
    
    trait Service[F[_]] {
      val items = List("authMethods", "secretEngines", "plugins", "CAs", "common")
    
      def doSomething(s: String): F[Unit] = ???
    
      def result(implicit F: Applicative[F]): F[Unit] =
        items.traverse_(doSomething)
    }
    

    If you want to use parTraverse_ here, the minimal changes necessary would look something like this:

    import cats.{Applicative, Parallel}
    import cats.instances.list._
    import cats.syntax.parallel._
    
    trait Service[F[_]] {
      val items = List("authMethods", "secretEngines", "plugins", "CAs", "common")
    
      def doSomething(s: String): F[Unit] = ???
    
      def result(implicit F: Applicative[F], P: Parallel[F]): F[Unit] =
        items.parTraverse_(doSomething)
    }
    

    Alternatively, you could use Parallel.parTraverse_(items)(doSomething) and skip the syntax import. Both approaches require a Foldable instance for List (provided here by the cats.instances.list._ import, which will no longer be necessary in Cats 2.2.0) and a Parallel instance for F, which you get via the P constraint.

    (Note that the Applicative constraint on result is no longer necessary in the second version, but that's only because this is a very simple example—I'm assuming your real code relies on something like Sync instead, and will need both that and Parallel.)
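
    As a rough sketch of the Parallel.parTraverse_ alternative described above, the version below drops both the syntax import and the Applicative constraint. It assumes Cats 2.x; the trait name ServiceNoSyntax is hypothetical, and doSomething is left abstract here instead of being stubbed with ???.

    ```scala
    import cats.Parallel
    import cats.instances.list._ // Foldable[List]; not needed from Cats 2.2.0 on

    // Hypothetical name, used only for this sketch.
    trait ServiceNoSyntax[F[_]] {
      val items = List("authMethods", "secretEngines", "plugins", "CAs", "common")

      // Left abstract so that a concrete F can supply the real work.
      def doSomething(s: String): F[Unit]

      // Only Parallel[F] is required: the instance carries the Applicative
      // used for the parallel composition, so no separate constraint is needed.
      def result(implicit P: Parallel[F]): F[Unit] =
        Parallel.parTraverse_(items)(doSomething)
    }
    ```

    Calling the method on the Parallel companion object is purely a matter of taste; it does the same thing as the items.parTraverse_(doSomething) syntax.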

    This answer needs a couple of footnotes, though. The first is that it might not actually be a good thing that parTraverse_ doesn't make you specify a bound the way parTraverseN does: starting an unbounded number of effects at once may result in excessive memory use, etc. (but this will depend on e.g. the expected size of your lists and the kind of work doSomething is doing, and is probably outside the scope of the question).
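
    For comparison, here is a minimal sketch of the bounded variant. It assumes a Cats Effect 2.x release in which parTraverseN is defined on the Concurrent companion object and takes the bound as its first argument (later major versions may differ). BoundedService and maxInFlight are hypothetical names.

    ```scala
    import cats.Parallel
    import cats.effect.Concurrent
    import cats.instances.list._ // Traverse[List]; not needed from Cats 2.2.0 on

    // Hypothetical name, used only for this sketch.
    trait BoundedService[F[_]] {
      val items = List("authMethods", "secretEngines", "plugins", "CAs", "common")

      def doSomething(s: String): F[Unit]

      // maxInFlight (hypothetical) caps how many doSomething calls run at once.
      // parTraverseN returns F[List[Unit]], so the result is discarded with void.
      def result(maxInFlight: Long)(implicit F: Concurrent[F], P: Parallel[F]): F[Unit] =
        F.void(Concurrent.parTraverseN(maxInFlight)(items)(doSomething))
    }
    ```

    Whether a bound is worth having, and what value to pick, depends on the size of the list and on what doSomething actually does.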

    The second footnote is that "parallel" in the sense of the Parallel type class is more general than the "parallel" in the parallel-vs.-concurrent distinction in the Cats Effect "Concurrency Basics" document. The Parallel type class models a very generic kind of logical parallelism that also encompasses error accumulation, for example. So when you write:

    "I assume this will run concurrently instead of in parallel (as in parallelism)."

    …your assumption is correct, but not exactly because the parTraverseN method is on Concurrent instead of Parallel; note that Concurrent.parTraverseN still requires a Parallel instance. When you see par or the Parallel type class in the context of cats.effect.Concurrent, you should think of concurrency, not "parallelism" in the "Concurrency Basics" sense.
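
    To make the "error accumulation" sense of Parallel concrete, here is a small sketch that uses Either instead of an effect type. It assumes Scala 2.13 and Cats 2.x with the catch-all cats.implicits._ import. The object name, the check function, and the modified item list are all hypothetical and exist only for illustration.

    ```scala
    import cats.implicits._

    // Hypothetical names throughout; nothing here runs on more than one thread.
    object ParallelEitherSketch {
      type ErrorOr[A] = Either[List[String], A]

      // Hypothetical rule: reject names that start with an upper-case letter.
      def check(s: String): ErrorOr[Unit] =
        if (s.headOption.exists(_.isUpper)) Left(List(s"bad name: $s"))
        else Right(())

      // Hypothetical list with two entries that fail the rule.
      val items = List("authMethods", "CAs", "plugins", "Common")

      // Sequential composition keeps only the first failure:
      // Left(List("bad name: CAs"))
      val sequential: ErrorOr[Unit] = items.traverse_(check)

      // The Parallel instance for Either composes through Validated and
      // accumulates every failure:
      // Left(List("bad name: CAs", "bad name: Common"))
      val accumulated: ErrorOr[Unit] = items.parTraverse_(check)
    }
    ```

    The same parTraverse_ call means "accumulate errors" for Either and "run concurrently" for an effect type, which is why the par prefix on its own says nothing about threads or CPU-level parallelism.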