
Remove lines from head and tail of multiple text files and merge


I have multiple text files with different data but the same header and footer text. I need to remove the header and footer from each file and merge the results into one output file. A one-liner with decent speed would be ideal. All of the file names start with ABC, and the files are in the same directory.

Example File1:

This is a sample header
This is Not required
I have to remove this data

....... DATA of file 1 .........

This is sample tail 
It needs to be removed

Example File2:

This is a sample header
This is Not required
I have to remove this data

....... DATA of file 2 .........

This is sample tail 
It needs to be removed

I am using

head -n -12 ABC.txt | tail -n +20 > output.txt 

but it processes only one file (12 lines need to be removed from the bottom and 20 from the top).


Solution

  • Assuming every file has a 20-line header and a 12-line footer, you can use sed to print line 21 through the 13th-from-last line:

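    for file in ABC*; do
        # Count lines by redirecting into wc; quote the name in case it contains spaces.
        numlines=$(wc -l < "$file")
        # Last data line: everything after it is the 12-line footer.
        lastline=$(( numlines - 12 ))
        # Print only lines 21..lastline; skip files with no data lines.
        (( 21 <= lastline )) && sed -n "21,${lastline}p" "$file" >> combined.txt
    done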
    

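    Files that contain only the header and footer, with no data lines in between, produce no output. If you prefer head and tail to sed, note that a negative count for head -n is a GNU extension, and that tail -n +21 starts output at line 21, so exactly 20 header lines are dropped (tail -n +20 would keep line 20):

    for file in ABC*; do
        numlines=$(wc -l < "$file")
        # More than 20 + 12 lines are needed for any data to remain.
        (( 32 < numlines )) && head -n -12 "$file" | tail -n +21 >> combined.txt
    done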
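
The loops above can be wrapped in a small reusable function. The following is only a sketch under stated assumptions: the function name strip_and_merge, its argument order, and the generated demo files are illustrative and not part of the original answer. It also assumes every file ends with a newline, since wc -l counts newline characters.

```shell
#!/usr/bin/env bash
# Sketch: strip a fixed-size header and footer from each file and append
# the remaining lines to one output file. The name strip_and_merge and its
# arguments are hypothetical, chosen for illustration.
strip_and_merge() {
    local header=$1 footer=$2 out=$3
    shift 3
    local file numlines first last
    : > "$out"                          # start with an empty output file
    for file in "$@"; do
        numlines=$(wc -l < "$file")
        first=$(( header + 1 ))         # first data line
        last=$(( numlines - footer ))   # last data line
        # Skip files with no data lines between header and footer.
        if (( first <= last )); then
            sed -n "${first},${last}p" "$file" >> "$out"
        fi
    done
}

# Demo on two small generated files (2-line header, 1-line footer).
tmpdir=$(mktemp -d)
printf '%s\n' h1 h2 data1a data1b f1 > "$tmpdir/ABC1.txt"
printf '%s\n' h1 h2 data2a f1 > "$tmpdir/ABC2.txt"
strip_and_merge 2 1 "$tmpdir/combined.txt" "$tmpdir"/ABC*
cat "$tmpdir/combined.txt"
```

For the files in the question, the call would be strip_and_merge 20 12 combined.txt ABC*. Keep the output file name outside the ABC* pattern so it is never picked up as an input.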