
CLOC ignore/exclude list file (.clocignore)


(Edit: see the Proper Usage section at the bottom.)

Main Question

How do you get cloc to use its --exclude-list-file=<file> option? Essentially, I'm trying to feed it a .clocignore file.

Expected Behavior

cloc documentation says the following:

--exclude-list-file=<file>  Ignore files and/or directories whose names
                          appear in <file>.  <file> should have one entry
                          per line.  Relative path names will be resolved
                          starting from the directory where cloc is
                          invoked.  See also --list-file.

Attempts

The following command works as expected:

cloc --exclude-dir=node_modules .

But this command doesn't exclude anything:

cloc --exclude-list-file=myignorefile .

This is the contents of myignorefile:

node_modules
node_modules/
node_modules/*
node_modules/**
./node_modules
./node_modules/
./node_modules/*
./node_modules/**
/full/path/to/current/directory/node_modules
/full/path/to/current/directory/node_modules/
/full/path/to/current/directory/node_modules/*
/full/path/to/current/directory/node_modules/**

cloc does not error if myignorefile doesn't exist, so I have no feedback on what it's doing.

(I'm running OS X and installed cloc v1.60 via Homebrew.)



Proper Usage

tl;dr: The method specified in @Raman's answer requires fewer entries in .clocignore and runs considerably faster.


Spurred on by @Raman's answer, I investigated the source code: cloc does in fact respect --exclude-list-file but processes it differently than --exclude-dir in two important ways.

Exact filename versus 'part of the path'

First, while --exclude-dir will ignore any files whose paths contain the specified strings, --exclude-list-file will only exclude the exact files or directories specified in .clocignore.

If you have a directory structure like this:

.clocignore
node_modules/foo/first.js
app/node_modules/bar/second.js

And .clocignore contains just

node_modules

Then cloc --exclude-list-file=.clocignore . will successfully ignore first.js but count second.js, whereas cloc --exclude-dir=node_modules . will ignore both.

To deal with this, .clocignore needs to contain this:

node_modules
app/node_modules
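Listing every nested copy by hand gets tedious in a large tree. Here is a minimal sketch for generating those entries, assuming a POSIX find, sed and sort are available. The function name list_dirs_named is hypothetical and not part of cloc.

```shell
# Hypothetical helper (not part of cloc): print every directory called
# NAME under ROOT, as a path relative to ROOT, one per line.
list_dirs_named() {
    name=$1
    root=${2:-.}
    # -prune stops find from descending into a match, so copies nested
    # inside an already-listed directory are not printed again.
    (cd "$root" && find . -type d -name "$name" -prune -print \
        | sed 's|^\./||' \
        | sort)
}

# Usage, from the directory where cloc will be invoked:
#   list_dirs_named node_modules > .clocignore
```

For the directory structure above, this should write node_modules and app/node_modules to .clocignore, which is the form --exclude-list-file needs.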

Performance

Second, the source code for cloc appears to add the directories specified in --exclude-dir to a list which is consulted before counting the files, whereas the list of directories discovered by --exclude-list-file is consulted after counting the files.

This means --exclude-list-file still processes the files, which can be slow, before ignoring their results in the final report. This is borne out by experiment: in an example codebase, cloc took half a second to run with --exclude-dir and 11 seconds with an equivalent --exclude-list-file.


Solution

  • The best workaround I've found is to feed the contents of .clocignore directly to --exclude-dir. For example, if you are using bash and have tr available:

    cloc --exclude-dir="$(tr '\n' ',' < .clocignore)" .
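The tr one-liner passes blank lines and trailing slashes straight through. A slightly more defensive sketch follows, assuming each line of .clocignore is a plain directory name and that sed and paste are available. The function name clocignore_to_list is hypothetical and not part of cloc.

```shell
# Hypothetical helper (not part of cloc): turn an ignore file into a
# comma-separated list suitable for --exclude-dir.
clocignore_to_list() {
    # Strip trailing slashes, drop blank lines, then join the remaining
    # lines with commas (no trailing comma).
    sed -e 's|/*$||' -e '/^[[:space:]]*$/d' "$1" | paste -sd, -
}

# Usage (assumes cloc is installed and .clocignore exists):
#   cloc --exclude-dir="$(clocignore_to_list .clocignore)" .
```

The quotes around the command substitution keep the shell from splitting the list if an entry contains a space.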