I'm moderately new to common lisp, but have extended experience with other 'separate compilation' languages (think C/C++/FORTRAN and such)
I know how to do an ASDF system definition. I know how to separate stuff in packages. I'm using SBCL, by the way.
The question is this: what's the best practice for splitting code (large packages) between .lisp files? I mean, in C there are include files, while lisp lives with the current image state. So with multiple files I need to handle dependencies or serial order in the system definition. But without something like forward declarations it's painful.
Simple example on what I want to do: I have, for example, two defstructs that are part of the same bigger data structure (like struct1 is a parent of some set of struct2). Some functions works on one, some other works on the other and some other use both.
So I would have: a packages.lisp, a fun1.lisp (with the first defstruct and related functions), a fun2.lisp (with the other defstruct and functions) and a funmix.lisp (with functions that use both). In an ideal world everything is sealed and compiling these in this order would be fine. As most of you know, this in practice almost never happen.
If I need to use struct2 functions from the struct1 ones I would need to either reorder or add a dependency. But then if there's some kind of back call (that can't be done with a closure) I would have struct1.lisp depending on struct2.lisp and vice-versa which is obviously not valid. So what? I could break the loop putting the defstruct in a separate file (say, structs.lisp) but what if either of the struct's function need to access the common functions in the third file? I would like to avoid style notes.
What's the common way to solve this, i.e. keeping loosely related code in the same file but still be able to interface to other ones. Is the correct solution to seal everything in a compilation unit (a single file)? use a package for every file with exports?
Lisp dependencies are simple, because in many cases, a Lisp implementation doesn't need to process the definition of something in order to compile its use.
Some exceptions to the rule are:
Macros: macros must be loaded in order to be expanded. There is a compile-time dependency between a file which uses macro and the file which defines them.
Packages: a package foo
must be defined in order to use symbols like foo:bar
or foo::priv
. If foo
is defined by a defpackage
form in some foo.lisp
file, then that file has to be loaded (either in source or compiled form).
Constants: constants defined with defconstant
should be seen before their use. Similar remarks apply to inline functions, compiler macros.
Any custom things in a "domain specific language" which enforces definition before use. E.g. if Whizbang Inference Engine needs rules to be defined when uses of the rules are compiled, you have to arrange for that.
For certain diagnostics to be suppressed like calls to undefined functions, the defining and using files must be taken to be as a single compilation unit. (See below.)
All the above remarks also have implications for incremental recompilation. When there is dependency like the above between files so that one is a prerequisite of the other, when the prerequisite is touched, the dependent one must be recompiled.
How to split code into files is going to be influenced by all the usual things: cohesion, coupling and what have you. Common-Lisp-specific reasons to keep certain things together in one file is inlining. The call to a function which is in the same file as the caller may be inlined. If your program supports any in-service upgrade, the granularity of code loading is individual files. If some functions foo
and bar
should be independently redefinable, don't put them in the same file.
Now about compilation units. Suppose you have a file foo.lisp
which defines a function called foo
and bar.lisp
which calls (foo)
. If you just compile bar.lisp
, you will likely get a warning that an undefined function foo
has been called. You could compile foo.lisp
first and then load
it, and then compile bar.lisp
. But that will not work if there is a circular reference between the two: say foo.lisp
also calls (bar)
which bar.lisp
defines.
In Common Lisp, you can defer such warnings to the end of a compilation unit, and what defines a compilation unit isn't a single file, but a dynamic scope established by a macro called with-compilation-unit
. Simply put, if we do this:
(with-compilation-unit
(compile-file "foo.lisp") ;; contains (defun foo () (bar))
(compile-file "bar.lisp")) ;; contains (defun bar () (foo))
If a compile-file
isn't surrounded by with-compilation-unit
then there is a compilation unit spanning that file. Otherwise, the outermost nesting of the with-compilation-unit
macro determines the scope of what is in the compilation unit.
Warnings about undefined functions (and such) are deferred to the end of the compilation unit. So by putting foo.lisp
and bar.lisp
compilation into one unit, we suppress the warnings about either foo
or bar
not being defined and we can compile the two in any order.
Build systems use with-compilation-unit
under the hood, as appropriate.
The compilation unit isn't about dependencies but diagnostics. Above, we don't have a compile time dependency. If we touch foo.lisp
, bar.lisp
doesn't have to be recompiled or vice versa.
By and large, Lisp codebases don't have a lot of hard dependencies among the files. Incremental compilation often means that just the affected files that were changed have to be recompiled. The C or C++ problem that everything has to be rebuilt because a core header file was touched is essentially nonexistent.