Tags: bash, unix, sed, awk, wc

Bash: Find the file with the most lines


This is my attempt:

  • Find all *.java files
    find . -name '*.java'
  • Count lines
    wc -l
  • Delete the last line (the "total" summary that wc prints when given several files)
    sed '$d'
  • Use awk to find the maximum line count in the wc output
    awk 'max=="" || data=="" || $1 > max {max=$1 ; data=$2} END{ print max " " data}'

Then combine the steps into a single pipeline:

find . -name '*.java' | xargs wc -l | sed '$d' | awk 'max=="" || data=="" || $1 > max {max=$1 ; data=$2} END{ print max " " data}'

Is there a way to count only non-blank lines?


Solution

  • find . -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \; | \
        sort -nr -t":" -k2 | awk -F: '{print $1; exit;}'
    

    Replace the awk command with head -n1 if you also want to see the number of non-blank lines.
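
    As a rough sketch of the head -n1 variant, the pipeline can be wrapped in a small shell function. The function name max_nonblank and its optional directory argument are assumptions made for illustration; they are not part of the original answer.

```shell
# Hypothetical helper (name and argument are illustrative only).
# Prints "path:count" for the *.java file under directory $1 (default: .)
# that has the most non-blank lines. Paths containing ':' would confuse
# the sort key, so this assumes none are present.
max_nonblank() {
    find "${1:-.}" -type f -name '*.java' -exec grep -H -c '[^[:space:]]' {} \; |
        sort -nr -t: -k2 |
        head -n1
}
```

    Under those assumptions, running max_nonblank src would print one line of the form src/path/to/File.java:count.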


    Breakdown of the command:

    find . -type f -name "*.java" -exec grep -H -c '[^[:space:]]' {} \; 
    '---------------------------'       '-----------------------'
                 |                                   |
       for each *.java file             Use grep to count non-blank lines
                                       -H includes filenames in the output
                                     (output = ./full/path/to/file.java:count)
    
    | sort -nr -t":" -k2  | awk -F: '{print $1; exit;}'
      '----------------'    '-------------------------'
              |                            |
      Sort the output in         Print filename of the first entry (largest count)
    reverse order using the         then exit immediately
      second column (count)
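
    For comparison, the counting and the maximum search can also be done in a single awk program, closer in spirit to the attempt in the question. This is only a sketch: the function name max_nonblank_awk is made up for illustration, NF treats lines holding only spaces or tabs as blank (slightly narrower than [:space:]), and with a very large number of files find may run awk more than once and so print more than one line.

```shell
# Hypothetical helper (name is illustrative only).
# Prints "count path" for the *.java file under directory $1 (default: .)
# with the most non-blank lines, without relying on sort or ':' as a delimiter.
max_nonblank_awk() {
    find "${1:-.}" -type f -name '*.java' -exec awk '
        FNR == 1 && NR > 1 { check() }          # a new file started: score the previous one
        FNR == 1 { file = FILENAME; n = 0 }     # reset the counter for the current file
        NF { n++ }                              # NF > 0: the line has a non-whitespace field
        END { if (NR) check(); if (best != "") print max, best }
        function check() { if (best == "" || n > max) { max = n; best = file } }
    ' {} +
}
```

    Because the file name never passes through a delimiter-based sort, paths containing ':' or spaces are handled without extra quoting.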