File churn in CVS


I'm looking to find the number of times each file has changed on a particular branch in our CVS repository. I'm particularly interested in the files that have changed the most. A "top 40" list would be good enough.


Solution

  • This was added as an edit by the original asker. I have converted it to a community wiki answer because it should be an answer, not an edit.

    In this case, the branch has been in use for about 6 months. If I check out the latest revisions on that branch ("cvs -z9 co -r r80m-1 ..."), the last number of each file's revision looks like the number of changes made on the current branch. Any file that has been changed in the past 180 days was changed on this branch. I'm using Linux, so I eventually did it this way:

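    # Walk files modified in the last 180 days, skipping CVS admin directories.
    # Reading names line by line keeps paths that contain spaces intact.
    find . -name CVS -prune -o -type f -mtime -180 -print |
    while IFS= read -r file
    do
       # </dev/null stops cvs (or ssh underneath it) from consuming the file list.
       cvs status "$file" < /dev/null | grep 'Working revision' | gawk -v FNAME="$file" '{ print FNAME gensub(/(\.)([0-9]*)$/, "\\1\\2 churn:\\2  ", 1) }'
    done > cvs_churn.txt
    sort -k3 -t: -n cvs_churn.txt | uniq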
    

    So, for each line in "cvs status" output like:

    Working revision: 1.2.34
    

    The gawk command changes it to:

    ./path/file.c Working revision: 1.2.34        churn:34
    

    and I can then sort on ":34".
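
    To get the "top 40" directly, the sort can be reversed and truncated. The following is only a sketch: the sample lines, file names and counts are made up for illustration, and it assumes no file path contains a colon.

    ```shell
    # Hypothetical sample of cvs_churn.txt (paths and counts are invented).
    churn_file=$(mktemp)
    printf '%s\n' \
        './src/a.c Working revision: 1.2.2.7 churn:7' \
        './src/b.c Working revision: 1.4.2.34 churn:34' \
        './src/c.c Working revision: 1.1.2.12 churn:12' \
        './src/b.c Working revision: 1.4.2.34 churn:34' \
        > "$churn_file"

    # Sort numerically on the third colon-separated field, largest first,
    # drop duplicate lines, and keep at most the 40 highest counts.
    top40=$(sort -t: -k3,3nr "$churn_file" | uniq | head -n 40)
    printf '%s\n' "$top40"
    rm -f "$churn_file"
    ```

    With the real cvs_churn.txt in place of the sample file, the same pipeline should list the most-changed files first.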


    This works, but it's pretty crude. I'm hoping others may be able to answer with better approaches.
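
    One possibly less crude variant pulls out just the last revision component with sed instead of gawk. This is a sketch, not a tested replacement: churn_of is a hypothetical helper name, and it assumes the "Working revision:" line looks like the one shown above, optionally followed by a timestamp as some CVS versions seem to print.

    ```shell
    # churn_of: read "cvs status" output on stdin and print the last numeric
    # component of the working revision (hypothetical helper, not part of CVS).
    # Lines without a dotted revision, e.g. "No entry for ...", print nothing.
    churn_of() {
        sed -n 's/.*Working revision:[[:space:]]*[0-9.]*\.\([0-9][0-9]*\).*/\1/p'
    }

    # Example with a made-up status line:
    printf '   Working revision:\t1.2.2.34\n' | churn_of   # prints 34
    ```

    In the loop this could be used as `cvs status "$file" < /dev/null | churn_of`, printing the count next to the file name.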

    In some other questions (e.g. "Free CVS reporting tools"), people have mentioned StatCVS. It sounds interesting (more than I need, but some of the other info might also be useful). However, it says it only works on the "default" branch. The documentation was a little unclear: can I check out the branch of interest and use it for this?