
Git post-receive hook for multiple branches: workflow or timestamp problem?


I have a development folder and a production folder on the same server, with one repository behind them that deploys to one folder or the other depending on which branch is pushed. I would like the development (test) folder to be deployed when dev is pushed to the repo, and the production folder (www) when master is pushed.

This works, but every deployment rewrites all files, not just those that have changed.

Example:

  • I am on the local dev branch, change only index.php, commit, and push. The change is deployed to the development folder, and only index.php gets a new file timestamp.
  • Next step: git checkout master, merge the dev branch, and push master. The change is deployed correctly to the production folder, but every file on the FTP server has a new timestamp, not just index.php.
  • Next step: git checkout dev, change fileX.php, commit, and push dev. The change is deployed correctly to the dev folder, but again every file on the FTP server has a new timestamp, not just fileX.php.

Is this normal, or am I doing something wrong? Can you recommend a better Git workflow for one local PC with Git, one development domain, and one production domain?

Thanks for help

post-receive hook:

#!/bin/sh
while read oldrev newrev ref
do
  branch=`echo $ref | cut -d/ -f3`
  if [ "master" == "$branch" ]; then
    git --work-tree=/home/www/domain.com/subdomains/www --git-dir=/home/www/domain.com/subdomains/repos/myrepo.git checkout master -f
    echo 'changes pushed to production'
  else
    git --work-tree=/home/www/domain.com/subdomains/test--git-dir=/home/www/domain.com/subdomains/repos/myrepo.git checkout dev -f
    echo 'changes pushed to dev'
  fi
done

Solution

  • This post-receive hook is badly flawed. The good news is that it's easily fixed.

    The root of your particular problem is that every Git repository, including the bare one[1] that is receiving the push here, starts out with one (just one) index. The index keeps track of the (single, as in just one again) work-tree. This is true even though the whole point of a bare repository is to have no work-tree: it still has an index.

    When one uses git --work-tree=<path> with this bare repository to do a git checkout operation, that checkout operation uses the (one, single) index to keep track of what it has stored in the (temporarily added) work-tree. That is, for the duration of the git checkout, this bare repository becomes non-bare: it has one index and one work-tree, and the work-tree is the one chosen for this one git checkout operation.

    For each subsequent git --work-tree=<path> checkout operation, Git will assume that the index correctly describes the current checkout to the target work-tree. (Git may discover, as it does its work, that the index doesn't correctly describe the target work-tree, but it starts out assuming that it does.) The files that Git decides to update in that work-tree will be based on this assumption, with some corrections potentially, but not always, inserted as it goes. This main assumption, that the index correctly describes the current checkout to the work-tree, holds if and only if the path argument used in the subsequent git checkout is the same as the path argument used in the previous git checkout.

    But we have two different git checkout commands:

    git --work-tree=/home/www/domain.com/subdomains/www ...
    

    and:

    git --work-tree=/home/www/domain.com/subdomains/test
    

    (A small aside: something went wrong with the cut-and-paste in the question, as the --git-dir option is fused with the --work-tree path; I have assumed the obvious fix here.)

    These are two different paths. They therefore need two different index files, or no index at all.[2]


    [1] A bare repository is a Git repository where core.bare is set to true. A bare repository has no work-tree and its Git database files are stored in the top level, rather than under a separate .git directory.

    [2] When there is no index, as in the initial git checkout, Git will build one as needed. This is more compute-intensive, and in a non-bare repository, where we actually use the index to build the next commit, destroying the index loses our carefully-built next commit: Git builds the new one from the current commit again so we have to start over.

    The work-tree itself is not affected by this rebuilding process, but if we can afford the space for one index file per work-tree—and index files are generally pretty small—it seems likely that we should do that. Still, it's an option, if you want it.


    Fixing the problem

    First, although it's not really related, let's fix the other obvious (but minor) problem:

    while read oldrev newrev ref
    do
      branch=`echo $ref | cut -d/ -f3`
    

    Suppose we add a tag named master/v1.1. This ref has, as its full spelling, the name refs/tags/master/v1.1. Let's see what we get here:

    $ ref=refs/tags/master/v1.1
    $ branch=`echo $ref | cut -d/ -f3`
    $ echo $branch
    master
    

    Whoops: our code will think we've updated branch master, when what we did is add the tag master/v1.1. We probably won't use such a tag, but why not do this correctly? Instead of using cut, let's check the whole reference:

    case $ref in
    refs/heads/*) branch=${ref#refs/heads/};;
    *) continue;; # not a branch
    esac
    

    Trying this out with our test ref we get:

    $ case $ref in
    > refs/heads/*) branch=${ref#refs/heads/};;
    > *) echo not a branch;;
    > esac
    not a branch
    

    Replacing ref with refs/heads/master we get:

    $ ref=refs/heads/master
    $ case $ref in
    refs/heads/*) branch=${ref#refs/heads/};;
    *) echo not a branch;;
    esac
    $ echo $branch
    master
    

    (There's no PS2 output in this section because I used control-P to redo the previous case.)
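
    The same check can be wrapped in a small helper so it can be tried outside the hook. This is only a sketch: the function name branch_of_ref is made up for illustration, and feature/login is a hypothetical branch name. It also shows a second weakness of the cut approach, which would report a branch named feature/login as just feature.

```shell
#!/bin/sh
# Hypothetical helper: print the branch name for a full ref and return 0;
# print nothing and return 1 for tags and any other non-branch ref.
branch_of_ref() {
    case $1 in
    refs/heads/*) printf '%s\n' "${1#refs/heads/}" ;;
    *) return 1 ;;
    esac
}

# Try it on a plain branch, a branch with a slash in its name, and a tag.
for ref in refs/heads/master refs/heads/feature/login refs/tags/master/v1.1
do
    if branch=$(branch_of_ref "$ref"); then
        echo "$ref -> branch $branch"
    else
        echo "$ref -> not a branch"
    fi
done
```

    Inside the hook, the equivalent use would be branch=$(branch_of_ref "$ref") || continue.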

    For real correctness, we should also look at $oldrev and $newrev to determine whether the ref in question is being created, updated, or deleted; but in this case we can just assume that branch names master and dev won't ever be deleted. So our new code reads:

    #!/bin/sh
    while read oldrev newrev ref
    do
        case $ref in
        refs/heads/*) branch=${ref#refs/heads/};;
        *) continue;; # not a branch - do not deploy
        esac
    
        case "$branch" in
        master) wt=/home/www/domain.com/subdomains/www loc=production;;
        dev) wt=/home/www/domain.com/subdomains/test loc=dev;;
        *) continue;; # not an INTERESTING branch - do not deploy
        esac
    
        # It's an interesting branch; deploy it to $wt.  Use an index
        # based on the branch name.  There is no need to specify the
        # git directory, which is in $GIT_DIR right now because this is
        # a post-receive script.
        GIT_INDEX_FILE="index.$branch" git --work-tree="$wt" checkout "$branch" -f
        echo "deployed to $loc"
    done
    

    The -f option to git checkout here is somewhat dangerous: it will overwrite or remove files that got modified in the work-tree, even though we now have a proper index to keep track of which files should be in which state. But it was there before, so I've kept it.
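
    The hook above assumes master and dev are never deleted. If you would rather not rely on that, the old and new hashes on each input line tell you what happened: Git passes an all-zero hash for the side that does not exist. A minimal sketch follows; the function names is_zero_hash and ref_action are my own, not anything Git provides.

```shell
#!/bin/sh
# Hypothetical helpers for classifying one line of post-receive input.
# A missing side is reported as an all-zero hash (40 zeros with SHA-1,
# 64 with SHA-256), so test for "only zeros" instead of a fixed length.
is_zero_hash() {
    case $1 in
    ''|*[!0]*) return 1 ;;   # empty, or contains a character other than 0
    *) return 0 ;;
    esac
}

ref_action() {   # usage: ref_action OLDREV NEWREV
    if is_zero_hash "$1"; then
        echo create
    elif is_zero_hash "$2"; then
        echo delete
    else
        echo update
    fi
}

# Made-up hashes, for demonstration only.
zero=0000000000000000000000000000000000000000
some=1111111111111111111111111111111111111111
ref_action "$zero" "$some"   # prints: create
ref_action "$some" "$zero"   # prints: delete
ref_action "$some" "$some"   # prints: update
```

    In the hook, a line such as [ "$(ref_action "$oldrev" "$newrev")" = delete ] && continue placed before the checkout would skip deleted branches.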

    This post-receive script (which I should note is entirely untested: watch out for typos) still has a flaw. Whatever branch we git checkout here, that will leave this repository set to recommend that branch to whoever clones this repository. Fixing this requires updating HEAD:

    git symbolic-ref HEAD refs/heads/master
    

    after the git checkout step. You have been living with this flaw all along, so maybe you don't really care about this one.
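
    If you want to try the per-branch index and the HEAD reset without touching the real server, something like the following throwaway script may help. It is a sketch under assumptions: every path lives in a temporary directory, the deploy function and the demo identity are invented for the example, and it uses an absolute path for GIT_INDEX_FILE so the index location does not depend on the current directory.

```shell
#!/bin/sh
# Throwaway demo: all paths below are temporary, made-up locations.
set -e
tmp=$(mktemp -d)
repo=$tmp/repo.git

# Build a bare repository holding master and dev, each at the same commit.
git init -q --bare "$repo"
git init -q "$tmp/src"
(
    cd "$tmp/src"
    git symbolic-ref HEAD refs/heads/master
    echo '<?php echo "hello";' > index.php
    git add index.php
    git -c user.name=demo -c user.email=demo@example.com commit -q -m initial
    git branch dev
    git push -q "$repo" master dev
)

# Hypothetical helper: deploy one branch to one work-tree, giving each
# branch an index file of its own inside the bare repository.
deploy() {   # usage: deploy BRANCH WORK_TREE
    mkdir -p "$2"
    GIT_INDEX_FILE="$repo/index.$1" \
        git --git-dir="$repo" --work-tree="$2" checkout -q -f "$1"
}

deploy master "$tmp/www"
deploy dev "$tmp/test"

# The dev checkout moved HEAD to dev; point it back at master.
git --git-dir="$repo" symbolic-ref HEAD refs/heads/master
git --git-dir="$repo" symbolic-ref HEAD   # prints refs/heads/master
ls "$repo"/index.*
```

    Each work-tree now has its own index file, so a later deploy of either branch compares against the right directory.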