
Reject commit of PDF files for Tortoise (Windows)


In our repository, a user has accidentally committed several PDFs, and now the repo is quite big. It's also not useful to have this kind of binary file there.

I've tried with Settings/Global ignore pattern:

*.o *.lo *.la *.al .libs *.so *.so.[0-9]* *.a *.pyc *.pyo __pycache__ *.rej *~ #*# .#* .*.swp .DS_Store [Tt]humbs.db *.pdf

But I was still able to add a PDF file and to commit it successfully.

I expected the commit to be blocked because of the setting. Is there another way to block PDFs for the repository?


Solution

    Why ignore patterns don't work

    The global ignore pattern does not block commits: it only hides unversioned files from status listings and from recursive adds, so you can still add an ignored file explicitly. This is described in the Red Bean Book, in the chapter on ignoring files. Global and per-directory ignores are intended to keep build artifacts, swap files, etc. out of view.

    Introducing the pre-commit hook

    To prevent certain files from being committed to the repository, Subversion offers hook scripts. These are run on the repository server when a certain event happens. To prevent a commit, use the pre-commit hook.

    Read the hook templates that svnadmin puts in any new repository. Answers to the following, related questions may also help: How do I implement an SVN hook to know the filename of the file committed, etc.? and SVN (server - pre-commit hook): Know the list of files that are being committed

    Hooks can be any kind of executables, usually shell scripts or batch files. It's possible to call any other script or program from a hook so there's hardly a limit to what one can do.

    Depending on whether the repository is hosted on a Windows or *NIX box and what your (scripting) language of choice is, the script will look very different. I can give the outline and some pointers. Feel free to post another answer with a concrete implementation.

    The algorithm to implement is:

    • Get commit transaction
    • Use svnlook on the transaction to get the changed file names (e.g. *.pdf) or to check whether a file is binary
    • If offending, write a message to stderr and return a non-zero exit code
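
    The steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes Python 3 is installed on the repository server and that svnlook is on the PATH of the hook's environment (hooks often run with an empty environment, so an absolute path may be needed). The names BLOCKED_SUFFIXES, offending_paths and main are made up for this example. It only checks file names, not file contents.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook that rejects added or modified *.pdf files."""
import subprocess
import sys

# Assumption: file name suffixes to reject; extend as needed.
BLOCKED_SUFFIXES = (".pdf",)


def offending_paths(changed_output, suffixes=BLOCKED_SUFFIXES):
    """Return the blocked paths found in the output of `svnlook changed`."""
    offenders = []
    for line in changed_output.splitlines():
        # Each line is a 4-character status field followed by the path,
        # e.g. "A   trunk/doc/manual.pdf".
        if len(line) < 5:
            continue
        status, path = line[:4], line[4:]
        if status[0] == "D":
            continue  # deleting a PDF is welcome
        if path.lower().endswith(suffixes):
            offenders.append(path)
    return offenders


def main(repos, txn):
    # Ask svnlook which paths the pending transaction touches.
    result = subprocess.run(
        ["svnlook", "changed", "-t", txn, repos],
        capture_output=True, text=True, check=True,
    )
    bad = offending_paths(result.stdout)
    if bad:
        # Subversion relays stderr to the client when the hook fails.
        sys.stderr.write("Commit rejected, PDF files are not allowed:\n")
        for path in bad:
            sys.stderr.write("  " + path + "\n")
        return 1  # non-zero exit code aborts the commit
    return 0


# Subversion invokes the hook as: pre-commit REPOS TXN-NAME
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(main(sys.argv[1], sys.argv[2]))
```

    On a *NIX server the script could be saved as hooks/pre-commit and made executable. On Windows the hook has to be something like pre-commit.bat or pre-commit.exe, so a small batch file that passes its two arguments on to the Python script would be needed.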

    For the sake of completeness, let's mention that TortoiseSVN offers client-side commit hooks. I hardly think that's a solution since it seems to offer the worst of both worlds.

    The shortcomings of a purely technical solution

    Problem solved? Hardly. Remember the last time your mail client told you it was "insecure" to send foo.exe by e-mail? Of course you immediately renamed that sucker to foo.exe.txt and sent it anyway!

    That's what your users will do. The moment you create an official policy "you must not commit PDF files", you also create an unofficial policy "rename PDF files before committing them".

    Not convinced? Assume you want to integrate a third-party library. You get the binary lib file and the API documentation as PDF. Better to have all of this in one place, in the correct version ... in your repository. Only you can't because you outlawed binary files and/or PDF files.

    One last thought ...

    Whether or not you choose to implement a pre-commit hook, you should communicate the reasons to your users. Without knowing why committing binary files is bad (and is it really?) any kind of restriction will just seem like an obstacle thrown in their way to keep them from doing their work. Make them understand and you won't need a hook script.

    The one sentence that sticks out in the question is that somebody "accidentally committed" something. Nobody should accept that things happen by accident.

    You should investigate why those files were committed. I can immediately think of three reasons:

    1. Sloppiness. It's a matter of professional conduct to carefully review your own changes before committing them. Having all changes peer reviewed is even better and guards against many more pitfalls.

    2. The user was not familiar with TortoiseSVN (not surprisingly, everybody is using Git these days). Some training on how to use the tools might be a good investment.

    3. The user thought it was the right thing to do. Build some common understanding of what you want to have in version control and what you don't.