Using scala-cats IO type to encapsulate a mutable Java library

I understand that generally speaking there is a lot to say about deciding what one wants to model as effect This discussion is introduce in Functional programming in Scala on the chapter on IO.

Nonethless, I have not finished the chapter, i was just browsing it end to end before takling it together with Cats IO.

In the mean time, I have a bit of a situation for some code I need to deliver soon at work. It relies on a Java Library that is just all about mutation. That library was started a long time ago and for legacy reason i don't see them changing.

Anyway, long story short. Is actually modeling any mutating function as IO a viable way to encapsulate a mutating java library ?

Edit1 (at request I add a snippet)

Readying into a model, mutate the model rather than creating a new one. I would contrast jena to gremlin for instance, a functional library over graph data.

def loadModel(paths: String*): Model =
    paths.foldLeft(ModelFactory.createOntologyModel(new OntModelSpec(OntModelSpec.OWL_MEM)).asInstanceOf[Model]) {
      case (model, path) ⇒
        val input = getClass.getClassLoader.getResourceAsStream(path)
        val lang  = RDFLanguages.filenameToLang(path).getName
        model.read(input, "", lang)
    }

That was my scala code, but the java api as documented in the website look like this.

// create the resource
Resource r = model.createResource();

// add the property
r.addProperty(RDFS.label, model.createLiteral("chat", "en"))
 .addProperty(RDFS.label, model.createLiteral("chat", "fr"))
 .addProperty(RDFS.label, model.createLiteral("<em>chat</em>", true));

// write out the Model
model.write(system.out);

// create a bag
Bag smiths = model.createBag();

// select all the resources with a VCARD.FN property
// whose value ends with "Smith"
StmtIterator iter = model.listStatements(
    new SimpleSelector(null, VCARD.FN, (RDFNode) null) {
        public boolean selects(Statement s) {
                return s.getString().endsWith("Smith");
        }
    });
// add the Smith's to the bag
while (iter.hasNext()) {
    smiths.add(iter.nextStatement().getSubject());
}

Solution

So, there are three solutions to this problem.

1. Simple and dirty

If all the usage of the impure API is contained in single / small part of the code base, you may just "cheat" and do something like:

def useBadJavaAPI(args): IO[Foo] = IO {
  // Everything inside this block can be imperative and mutable.
}

I said "cheat" because the idea of IO is composition, and a big IO chunk is not really composition. But, sometimes you only want to encapsulate that legacy part and do not care about it.

2. Towards composition.

Basically, the same as above but dropping some flatMaps in the middle:

// Instead of:
def useBadJavaAPI(args): IO[Foo] = IO {
  val a = createMutableThing()
  mutableThing.add(args)
  val b = a.bar()
  b.computeFoo()
}

// You do something like this:
def useBadJavaAPI(args): IO[Foo] =
  for {
    a <- IO(createMutableThing())
    _ <- IO(mutableThing.add(args))
    b <- IO(a.bar())
    result <- IO(b.computeFoo())
  } yield result

There are a couple of reasons for doing this:

Because the imperative / mutable API is not contained in a single method / class but in a couple of them. And the encapsulation of small steps in IO is helping you to reason about it.
Because you want to slowly migrate the code to something better.
Because you want to feel better with yourself :p

3. Wrap it in a pure interface

This is basically the same that many third party libraries (e.g. Doobie, fs2-blobstore, neotypes) do. Wrapping a Java library on a pure interface.

Note that as such, the amount of work that has to be done is way more than the previous two solutions. As such, this is worth it if the mutable API is "infecting" many places of your codebase, or worse in multiple projects; if so then it makes sense to do this and publish is as an independent module.
(it may also be worth to publish that module as an open-source library, you may end up helping other people and receive help from other people as well)

Since this is a bigger task is not easy to just provide a complete answer of all you would have to do, it may help to see how those libraries are implemented and ask more questions either here or in the gitter channels.

But, I can give you a quick snippet of how it would look like:

// First define a pure interface of the operations you want to provide
trait PureModel[F[_]] { // You may forget about the abstract F and just use IO instead.
  def op1: F[Int]
  def op2(data: List[String]): F[Unit]
}

// Then in the companion object you define factories.
object PureModel {
  // If the underlying java object has a close or release action,
  // use a Resource[F, PureModel[F]] instead.
  def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] = ???
}

Now, how to create the implementation is the tricky part. Maybe you can use something like Sync to initialize the mutable state.

def apply[F[_]](args)(implicit F: Sync[F]): F[PureModel[F]] =
  F.delay(createMutableState()).map { mutableThing =>
    new PureModel[F] {
      override def op1: F[Int] = F.delay(mutableThing.foo())
      override def op2(data: List[String]): F[Unit] = F.delay(mutableThing.bar(data))
    }
  }