version-control refactoring clearcase

Finding the most commonly edited files in ClearCase


We are currently planning a quality improvement exercise, and I would like to target the most commonly edited files in our ClearCase VOBs. Since we have just been through a bug-fixing phase, the most commonly edited files should give a good indication of where the most bug-prone code is, and therefore which code is most in need of quality improvement.

Does anyone know if there is a way of obtaining a top 100 list of most edited files? Preferably this would cover edits that are happening on multiple branches.


Solution

  • (The previous answer was for a simpler case: single branch)

    Since development on most projects has not all happened on a single branch, version numbers alone do not tell you which files are edited most. One way to get the number of check-ins across all branches would be to:

    • search all versions created since the date of the last bug fixing phase,
    • sort them by file,
    • then by occurrence.

    Something along the lines of:

    C:\Prog\cc\test\test>ct find -all -type f -ver "created_since(16-Oct-2009)" -exec "cleartool descr -fmt """%En~%Sn\n""" """%CLEARCASE_XPN%"""" | grep -v "\\0" | awk -F ~ "{print $1}" | sort | uniq -c | sort /R | head -100
    

    Or, for Unix syntax:

    $ ct find -all -type f -ver 'created_since(16-Oct-2009)' -exec 'cleartool descr -fmt "%En~%Sn\n" "$CLEARCASE_XPN"' | grep -v "/0" | awk -F ~ '{print $1}' | sort | uniq -c | sort -rn | head -100
    
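    • Replace the date with that of the label marking the start of your bug-fixing phase.
    • Again, note the double quotes around %CLEARCASE_XPN%, which accommodate spaces within file names.
    • Here, %CLEARCASE_XPN% is used rather than %CLEARCASE_PN% because we need every version: the extended path name identifies one specific version of the element, not just the element.
    • grep -v "/0" is here to exclude version 0 of each branch (/main/0, /main/myBranch/0, ...). It also drops any path that contains a component starting with 0; anchoring the pattern as "/0$" is stricter.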
    • awk -F ~ "{print $1}" is used to only print the first part of each line:
      C:\Prog\cc\test\test\a.txt~\main\mybranch\2 becomes C:\Prog\cc\test\test\a.txt
    • From there, the counting and sorting can begin:
      • sort to make sure identical lines end up next to each other
      • uniq -c to collapse each run of identical lines into one line, preceded by the number of occurrences
      • sort -rn (or sort /R for Windows) to put the most edited files at the top
      • head -100 to keep only the 100 most edited files.
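
    The counting half of the pipeline can be tried without a VOB. What follows is only a minimal sketch: top_edited and sample_versions are hypothetical helper names (not cleartool commands), the paths are made up, and the input is assumed to be one "element~version" line per version, as produced by -fmt "%En~%Sn\n". It uses the anchored pattern '/0$' rather than "/0", and it assumes no element name contains a ~.

```shell
#!/bin/sh
# Hypothetical helper: reads "element~version" lines on stdin and prints
# the N elements with the most versions (N defaults to 100).
top_edited() {
    limit="${1:-100}"
    grep -v '/0$' |            # drop version 0 of every branch
    awk -F '~' '{print $1}' |  # keep only the element path
    sort |                     # put identical paths next to each other
    uniq -c |                  # one line per path, prefixed by its count
    sort -rn |                 # highest count first
    head -n "$limit"           # keep the top N
}

# Made-up data standing in for the output of the cleartool find command.
sample_versions() {
    printf '%s\n' \
        '/vobs/app/a.c~/main/0' \
        '/vobs/app/a.c~/main/1' \
        '/vobs/app/a.c~/main/mybranch/0' \
        '/vobs/app/a.c~/main/mybranch/1' \
        '/vobs/app/a.c~/main/mybranch/2' \
        '/vobs/app/b.c~/main/0' \
        '/vobs/app/b.c~/main/1'
}

sample_versions | top_edited 2
```

    With this sample, a.c is listed first with a count of 3 (one version on /main and two on /main/mybranch) and b.c second with a count of 1; the version 0 entries are ignored. Against a real VOB, the output of the ct find command would be piped into top_edited 100 in place of sample_versions.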

    Again, GnuWin32 will come in handy for the Windows version of the one-liner.