Tags: perl, shell, awk, grep, wc

Find the longest sublist


I have a file containing lists with sublists, and I want to extract the longest sublist using command-line tools.

File example:

* Item1
** SubItem1
** ...
** SubItemN

* Item2
** SubItem1
** ...
** SubItemN

* ...
** ...

* ItemN
** SubItem1
** ...
** SubItemN

I would like to know whether this can be done easily; otherwise I will write a Perl script.


Solution

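  • A Perl one-liner that reads the file in paragraph mode (-00) and keeps the paragraph with the most "**" lines. Counting "**" lines rather than newlines avoids undercounting the last paragraph, which usually ends with one newline fewer than the others:

    perl -00 -ne '$n = () = /^\*\*/mg; if ($n > $m) { $m = $n; $max = $_ } END { print $max }' file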
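
The same paragraph-mode idea can be sketched in awk, which is one of the tools the question is tagged with. This is only a sketch under a few assumptions: groups are separated by blank lines, sublist entries start with "**", and the function name longest_block is made up for illustration. It counts the "**" lines in each block and prints the block with the most.

```shell
# Hypothetical helper: print the blank-line-separated block of file $1
# that contains the most lines starting with "**".
longest_block() {
    # RS='' puts awk in paragraph mode; -F'\n' makes each line one field.
    awk -v RS='' -F'\n' '
        {
            n = 0
            for (i = 1; i <= NF; i++)
                if ($i ~ /^\*\*/) n++          # count sublist entries only
            if (n > max) { max = n; best = $0 }
        }
        END { if (best != "") print best }
    ' "$1"
}
```

Calling `longest_block file` prints the winning item together with its sublist. On a tie, the first block with that count is kept.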
    

    Just using bash:

    max=0
    while read -r bullet thingy; do
        case $bullet in
             "*") item=$thingy; count=0 ;;   # new top-level item
            "**") ((count++)) ;;             # one more subitem
              "") (( count > max )) && { max_item=$item; max=$count; } ;;   # end of group
        esac
    done < <(cat file; echo)
    echo "$max_item" "$max"
    

    The <(cat file; echo) part ensures that there is a blank line after the last line of the file, so that the last sublist group is also compared against the max.
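
To see why the extra blank line matters, here is a small sketch that runs the same counting logic twice on made-up sample data in which the last group is the longest. The helper name count_max and the sample contents are assumptions for illustration. The sketch uses a pipe instead of process substitution, which is fine here only because the helper prints its own result.

```shell
# Made-up sample: the last group is the longest, and the file does not
# end with a blank line.
sample=$(mktemp)
printf '* numbers\n** 1\n** 2\n\n* letters\n** a\n** b\n** c\n' > "$sample"

# Hypothetical helper: run the counting loop on stdin and print "<item> <count>".
count_max() {
    item=''; count=0; max=0; max_item=''
    while read -r bullet thingy; do
        case $bullet in
             "*") item=$thingy; count=0 ;;
            "**") count=$((count + 1)) ;;
              "") if [ "$count" -gt "$max" ]; then max_item=$item; max=$count; fi ;;
        esac
    done
    echo "$max_item $max"
}

count_max < "$sample"                   # prints "numbers 2": last group never compared
{ cat "$sample"; echo; } | count_max    # prints "letters 3": blank line closes the last group
rm -f "$sample"
```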

    That only keeps the count. To save the items in the biggest sublist:

    max=0
    while read -r bullet thingy; do
        case $bullet in
             "*") item=$thingy; sublist=() ;;      # new item: start an empty sublist
            "**") sublist+=("$thingy") ;;          # quoted, so an entry with spaces stays one element
              "") if (( ${#sublist[@]} > max )); then
                      max=${#sublist[@]}
                      max_item=$item
                      max_sublist=("${sublist[@]}")
                  fi
                  ;;
        esac
    done < <(cat file; echo)
    printf "%s\n" "$max_item" "${#max_sublist[@]}" "${max_sublist[@]}"
    

    Using sudo_O's example input, this outputs:

    letters
    6
    a
    b
    b
    d
    e
    f