"Modular" Scala guidelines


Recently I've become confused about how to organize my Scala code, because there are a lot of options.

Are there any guidelines for how and when to use packages, objects, and package objects to organize Scala code?


Solution

  • Understanding Scala's capabilities

    First, we need to understand the capabilities and limitations of each modularization strategy.

    Packages

    These work just like in Java. You can use many files to declare different parts of one package, and you can nest many levels deep. This provides maximum flexibility with your layout. However, since the default classloaders expect to only find classes and interfaces in packages, that's all Scala lets you put there. (Classes, traits, and objects.)

    Objects

    Objects can contain anything--methods, fields, other objects, classes, traits, etc. Nested classes, traits, and objects are actually their own separate entities with the containing object as a name-mangled prefix (as far as the JVM is concerned). An object must be contained wholly within one file, and although you can nest classes arbitrarily deep, it's done via mangling increasingly long names, not adding to the path for the classloader.
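
    As a rough illustration of the nesting and name mangling described above, here is a minimal sketch. The names Geometry, Point, Shape, Shapes, and Circle are made up for this example, and the printed class names assume the file is compiled with these objects at the top level.

```scala
// Hypothetical names, for illustration only.
object Geometry {
  val Dimensions: Int = 2                       // a field

  def distance(a: Point, b: Point): Double =    // a method
    math.hypot(a.x - b.x, a.y - b.y)

  final case class Point(x: Double, y: Double)  // a nested class
  trait Shape { def area: Double }              // a nested trait

  object Shapes {                               // a nested object
    final case class Circle(radius: Double) extends Shape {
      def area: Double = math.Pi * radius * radius
    }
  }
}

object GeometryDemo {
  def main(args: Array[String]): Unit = {
    import Geometry._
    println(distance(Point(0, 0), Point(3, 4))) // 5.0

    // On the JVM the enclosing objects show up as a `$`-separated name
    // prefix, not as extra path segments for the classloader.
    println(classOf[Point].getName)             // ends with Geometry$Point
    println(classOf[Shapes.Circle].getName)     // ends with Geometry$Shapes$Circle
  }
}
```

    Everything here has to live in one file, which is the practical limit on organizing a large code base with objects alone.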

    Package objects

    The problem with only having objects and packages is that you might want a nested structure:

    scala.xml
    scala.xml.include
    scala.xml.include.sax
    

    so that you need to use packages (to avoid having one gigantic file and disturbingly long class names). But you also might want

    import scala.xml._
    

    to make various constants and implicit conversions available to you, so that you need to use an object. Package objects come to the rescue; they are essentially the same as ordinary objects, but when you say

    import scala.xml._
    

    you get everything in the package (scala.xml._) and also everything in the corresponding package object (scala.xml.package).
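
    A small sketch of that behaviour, using a made-up package myapp.units rather than scala.xml. The names MetersPerKilometer, KilometerOps, kmInMeters, and Distance are assumptions for the example. The brace-delimited package clauses are used only so that everything fits in one file; by convention the package object would normally live in a file named package.scala inside the package's directory.

```scala
package myapp {
  // Package object: the place for constants and implicits that belong
  // to the package as a whole.
  package object units {
    val MetersPerKilometer: Double = 1000.0

    // Implicit class providing an extension method on Double.
    implicit class KilometerOps(private val value: Double) {
      def kmInMeters: Double = value * MetersPerKilometer
    }
  }

  // Ordinary package member; other files could add more members.
  package units {
    final case class Distance(meters: Double)
  }
}

object UnitsDemo {
  // A single wildcard import brings in the class from the package and
  // the constant and implicit from the package object.
  import myapp.units._

  def main(args: Array[String]): Unit = {
    val d = Distance(2.5.kmInMeters)
    println(d.meters)            // 2500.0
    println(MetersPerKilometer)  // 1000.0
  }
}
```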

    How to modularize your code

    Now that we know how each part works, there are fairly obvious rules for how to organize:

    • Place related code into a package
    • If there are many related sub-parts, place those into sub-packages
    • If a package requires implicits or constants, put those into the package object for that package
    • If you have a terminal branch of your package hierarchy, it is your choice as to whether it should be an object or a package object. There are a few things that package objects are not allowed to do (though the list is getting smaller all the time--I'm not sure there's anything left except a prohibition against shadowing other names in the package), so a regular object might be a better choice. As long as you're not worried about binary compatibility, it's easy to change your mind later--just change object to package object in most cases.
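
    The rules above might translate into a layout like the following sketch. All names (shop, orders, validation, Order, Rules, Formatting, MaxItems) are hypothetical, and the nested package clauses stand in for what would usually be separate directories and files.

```scala
package shop {
  // Rule 3: constants (and implicits) go into the package object.
  package object orders {
    val MaxItems: Int = 3
  }

  // Rule 1: related code shares a package.
  package orders {
    final case class Order(items: List[String])

    // Rule 2: a group of related sub-parts becomes a sub-package.
    package validation {
      object Rules {
        def isValid(order: Order): Boolean =
          order.items.nonEmpty && order.items.size <= MaxItems
      }
    }

    // Rule 4: a terminal branch, written here as a plain object.
    object Formatting {
      def summary(order: Order): String = s"${order.items.size} item(s)"
    }
  }
}

object ShopDemo {
  import shop.orders._
  import shop.orders.validation.Rules

  def main(args: Array[String]): Unit = {
    val order = Order(List("book", "pen"))
    println(Rules.isValid(order))       // true
    println(Formatting.summary(order))  // 2 item(s)
  }
}
```

    Choosing a plain object for Formatting means callers write Formatting.summary(...) or import Formatting._ explicitly. Making it a package object instead would fold its members into a wildcard import of that package.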