haskell, shake-build-system

Recover the source file name in a Shake rule


I am writing a build system for a static website which works like this:

  • for every file src/123-some-title.txt produce a file out/123.html

My problem is that when writing the rule for out/*.html I have no direct way to recover the source file name (src/123-some-title.txt) from the target file name (out/123.html).

Of course I could read the src/ directory again and search for a file that starts with 123, but is there a nicer way to do this with Shake?


Solution

  • The first thing to mention is that if you call getDirectoryFiles multiple times with the same arguments, Shake computes the result only once per build, in the same way that calling need multiple times on the same file builds it only once. One approach would be:

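    -- Needs: import Control.Monad (when); import Data.List (isPrefixOf)
    -- (%> is the current spelling of the older *> rule operator)
    "out/*.fwd" %> \out -> do
        res <- getDirectoryFiles "src" ["*.txt"]
        -- keep the source files whose base name starts with "<number>-"
        let match = [x | x <- res, (takeBaseName out ++ "-") `isPrefixOf` takeBaseName x]
        when (length match /= 1) $ error "fail, because wrong number of matches"
        writeFileChanged out $ head match

    "out/*.html" %> \out -> do
        src <- readFile' (out -<.> "fwd")   -- name of the matching source file
        txt <- readFile' ("src" </> src)
        ...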
    

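    Here the idea is that the file out/123.fwd contains the text 123-some-title.txt, i.e. the name of the matching source file. writeFileChanged leaves the file untouched when the new contents equal the old, so the .fwd file (and everything that depends on it) only changes when the relevant part of the directory changes.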

    If you want to avoid the .fwd files, you can use the Oracle mechanism. If you want to avoid a linear scan of the getDirectoryFiles result you can use the newCache function. In practice, neither is likely to be problematic, and going with the files is probably simplest.
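For comparison, here is a minimal sketch of the Oracle variant, assuming a reasonably recent Shake (one that uses RuleResult and %>). The names SourceFor and matchSource are hypothetical and only serve the illustration, and the final writeFile' is a placeholder for the real rendering step, which the answer above leaves open.

```haskell
{-# LANGUAGE DeriveDataTypeable, GeneralizedNewtypeDeriving, TypeFamilies #-}
import Data.List (isPrefixOf)
import Development.Shake
import Development.Shake.Classes
import Development.Shake.FilePath

-- Hypothetical oracle question: "which source file belongs to this number?"
newtype SourceFor = SourceFor String
    deriving (Show, Typeable, Eq, Hashable, Binary, NFData)
type instance RuleResult SourceFor = FilePath

-- Pure matching step: exactly one file must start with "<number>-".
matchSource :: String -> [FilePath] -> Either String FilePath
matchSource n files =
    case [x | x <- files, (n ++ "-") `isPrefixOf` takeBaseName x] of
        [x] -> Right x
        xs  -> Left ("expected one source for " ++ n ++ ", found " ++ show (length xs))

main :: IO ()
main = shakeArgs shakeOptions $ do
    -- The oracle is re-evaluated on each build; dependants rerun only
    -- when its answer changes.
    sourceFor <- addOracle $ \(SourceFor n) -> do
        res <- getDirectoryFiles "src" ["*.txt"]
        either error return (matchSource n res)

    "out/*.html" %> \out -> do
        src <- sourceFor (SourceFor (takeBaseName out))
        txt <- readFile' ("src" </> src)
        writeFile' out txt  -- placeholder: real HTML generation goes here
```

Compared with the .fwd approach, this trades the intermediate files for a little type-level boilerplate; the trailing "-" in the prefix keeps 12 from matching 123-some-title.txt.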