Tags: linux, bash, ubuntu, file-io, tail

Running tail over directories that contain groups of similarly named files


I am looking for a smart way to handle this case.

The cpu-0 and fhcount directories each contain many files, but the files can be grouped by a common name prefix.

Here is a bird's-eye view of the directories and files:

../cpu-0/
        cpu-idle-01-01-2016
        cpu-idle-02-01-2016
        cpu-idle-03-01-2016
        .
        .
        cpu-interrupt-01-01-2016
        cpu-interrupt-02-01-2016
        cpu-interrupt-03-01-2016
        .
        .
        .
        cpu-nice-01-01-2016
        cpu-nice-02-01-2016
        .
        .
../fhcount/
        file_handles-max-01-01-2016
        file_handles-max-02-01-2016
        file_handles-max-03-01-2016
        .
        .
        file_handles-unused-01-01-2016
        file_handles-unused-02-01-2016
        file_handles-unused-03-01-2016
        .
        .
        .
        file_handles-used-01-01-2016
        file_handles-used-02-01-2016
        .
        .

As you can see, there is a pattern. So far I have hardcoded each group in order to tail the related files:

curdir="$PWD"

tail -q -n +2 "$curdir"/cpu-0/cpu-idle* > cpu-idle_combined
tail -q -n +2 "$curdir"/cpu-0/cpu-interrupt* > cpu-interrupt_combined
tail -q -n +2 "$curdir"/cpu-0/cpu-nice* > cpu-nice_combined

tail -q -n +2 "$curdir"/fhcount/file_handles-max* > file_handles-max_combined
tail -q -n +2 "$curdir"/fhcount/file_handles-unused-* > file_handles-unused_combined
tail -q -n +2 "$curdir"/fhcount/file_handles-used-* > file_handles-used_combined

How can I do the same thing without hardcoding every group name?


Solution

  • This goes through all files in the subdirectories, collects the common part of the file names (each name with its trailing date removed), and then writes each group into its own combined output file:

    #!/bin/bash
    
    # Required for the +(pattern) glob
    shopt -s extglob
    
    # Associative array used as set of unique file name roots
    declare -A roots
    
    # Shorten names like cpu-0/cpu-idle-01-01-2016 to cpu-0/cpu-idle
    # +([[:digit:]-]) matches one or more digits or hyphens
    # ${fname%%pattern} removes the longest match of pattern from the end of fname
    for fname in */*; do
        roots["${fname%%+([[:digit:]-])}"]=1
    done
    
    # Loop through unique roots, print to output files
    for fname in "${!roots[@]}"; do
        tail -q -n +2 "$fname"* > "$fname"_combined
    done
    

    Associative arrays require Bash 4.0 or newer.
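
To see what the suffix removal in the first loop does on its own, the following sketch applies the same expansion to a few sample names. The function name strip_date is a made-up helper for illustration only and is not part of the script above.

```shell
#!/bin/bash
# Illustration only: apply the script's suffix removal to single names.
# strip_date is a hypothetical helper name, not part of the original script.

# Required for the +(pattern) glob
shopt -s extglob

strip_date() {
    # Remove the longest trailing run of digits and hyphens from the argument
    printf '%s\n' "${1%%+([[:digit:]-])}"
}

strip_date cpu-0/cpu-idle-01-01-2016             # prints cpu-0/cpu-idle
strip_date fhcount/file_handles-max-02-01-2016   # prints fhcount/file_handles-max
strip_date cpu-0/cpu-idle_combined               # printed unchanged: no trailing digits or hyphens
```

Because `%%` takes the longest match, the whole date including its leading hyphen is removed, while the digit in the directory name cpu-0 is untouched since it is not at the end of the string.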

    For an example input file structure of

    .
    ├── cpu-0
    │   ├── cpu-idle-01-01-2016
    │   ├── cpu-idle-02-01-2016
    │   ├── cpu-idle-03-01-2016
    │   ├── cpu-interrupt-01-01-2016
    │   ├── cpu-interrupt-02-01-2016
    │   ├── cpu-interrupt-03-01-2016
    │   ├── cpu-nice-01-01-2016
    │   ├── cpu-nice-02-01-2016
    │   └── cpu-nice-03-01-2016
    └── fhcount
        ├── file_handles-max-01-01-2016
        ├── file_handles-max-02-01-2016
        ├── file_handles-max-03-01-2016
        ├── file_handles-unused-01-01-2016
        ├── file_handles-unused-02-01-2016
        ├── file_handles-unused-03-01-2016
        ├── file_handles-used-01-01-2016
        ├── file_handles-used-02-01-2016
        └── file_handles-used-03-01-2016
    

    the resulting output structure is

    .
    ├── cpu-0
    │   ├── cpu-idle-01-01-2016
    │   ├── cpu-idle-02-01-2016
    │   ├── cpu-idle-03-01-2016
    │   ├── cpu-idle_combined
    │   ├── cpu-interrupt-01-01-2016
    │   ├── cpu-interrupt-02-01-2016
    │   ├── cpu-interrupt-03-01-2016
    │   ├── cpu-interrupt_combined
    │   ├── cpu-nice-01-01-2016
    │   ├── cpu-nice-02-01-2016
    │   ├── cpu-nice-03-01-2016
    │   └── cpu-nice_combined
    └── fhcount
        ├── file_handles-max-01-01-2016
        ├── file_handles-max-02-01-2016
        ├── file_handles-max-03-01-2016
        ├── file_handles-max_combined
        ├── file_handles-unused-01-01-2016
        ├── file_handles-unused-02-01-2016
        ├── file_handles-unused-03-01-2016
        ├── file_handles-unused_combined
        ├── file_handles-used-01-01-2016
        ├── file_handles-used-02-01-2016
        ├── file_handles-used-03-01-2016
        └── file_handles-used_combined
    

    and for example input file contents like

    $ head cpu-idle*
    ==> cpu-idle-01-01-2016 <==
    1cpu-idle-01-01-2016
    2cpu-idle-01-01-2016
    3cpu-idle-01-01-2016
    
    ==> cpu-idle-02-01-2016 <==
    1cpu-idle-02-01-2016
    2cpu-idle-02-01-2016
    3cpu-idle-02-01-2016
    
    ==> cpu-idle-03-01-2016 <==
    1cpu-idle-03-01-2016
    2cpu-idle-03-01-2016
    3cpu-idle-03-01-2016
    

    the combined output files contain

    $ cat cpu-idle_combined
    2cpu-idle-01-01-2016
    3cpu-idle-01-01-2016
    2cpu-idle-02-01-2016
    3cpu-idle-02-01-2016
    2cpu-idle-03-01-2016
    3cpu-idle-03-01-2016
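
One thing to keep in mind with the script above: on a second run, the `*_combined` files written by the first run also match `*/*` and `"$fname"*`, so they would be picked up as inputs. A possible re-runnable variant is sketched below, under the assumption that every input file name ends in a hyphen followed by a date made of digits and hyphens. The function name combine_groups and its base-directory argument are hypothetical and only serve the example.

```shell
#!/bin/bash
# Sketch of a re-runnable variant; combine_groups is a hypothetical helper name.
# Assumes input names look like <root>-<digits and hyphens>, e.g. cpu-idle-01-01-2016.

# extglob: needed for +(pattern); nullglob: unmatched globs expand to nothing
shopt -s extglob nullglob

combine_groups() {
    local base=${1:-.} fname root
    local -A roots=()
    local -a inputs

    for fname in "$base"/*/*; do
        # Skip outputs of earlier runs and anything that is not a regular file
        [[ -f $fname && $fname != *_combined ]] || continue
        root=${fname%%+([[:digit:]-])}
        # Only keep names that actually had a trailing date-like suffix
        if [[ $root != "$fname" ]]; then
            roots["$root"]=1
        fi
    done

    for root in "${!roots[@]}"; do
        # "$root"-[0-9]* matches the dated inputs but not "${root}_combined"
        inputs=("$root"-[0-9]*)
        # With no inputs, tail would read standard input, so skip the group
        (( ${#inputs[@]} )) || continue
        tail -q -n +2 "${inputs[@]}" > "${root}_combined"
    done
}
```

Calling `combine_groups .` from the directory that contains cpu-0 and fhcount should produce the same `*_combined` files as the script above, and calling it again should leave their contents unchanged.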