I have multiple text files with different data but the same header and footer text. I need to remove the header and footer from each file and merge what remains into one output file. A one-liner with decent speed would be good. All of the file names start with ABC, and the files are in the same directory.
Example File1:
This is a sample header
This is Not required
I have to remove this data
....... DATA of file 1 .........
This is sample tail
It needs to be removed
Example File2:
This is a sample header
This is Not required
I have to remove this data
....... DATA of file 2 .........
This is sample tail
It needs to be removed
I am using
head -n -12 ABC.txt | tail -n +20 > output.txt
but it processes only 1 file. (12 lines to be removed from bottom, 20 to be removed from top)
Assuming all the files have a 20-line header and a 12-line footer, you can use sed to extract the 21st line through the 13th-to-last line of each file:
for file in ABC*; do
    numlines=$(wc -l < "$file")
    lastline=$(( numlines - 12 ))
    # Print only lines 21 through lastline; skip files with no data lines.
    (( 21 <= lastline )) && sed -n "21,${lastline}p" "$file" >> combined.txt
done
Files that only have the header and footer, but no additional lines, produce no output.
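If the header and footer lengths might change, the same loop can be wrapped in a small function. This is only a sketch: the name strip_merge and its argument order are made up for illustration, and it assumes every file ends with a newline so that wc -l reports the true line count.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip_merge HEADER FOOTER OUTPUT FILE...
# Drops the first HEADER and last FOOTER lines of each FILE and
# appends whatever is left to OUTPUT.
strip_merge() {
    local header=$1 footer=$2 out=$3
    shift 3
    local file numlines
    for file in "$@"; do
        numlines=$(wc -l < "$file")
        # Skip files that have nothing between header and footer.
        if (( numlines > header + footer )); then
            sed -n "$(( header + 1 )),$(( numlines - footer ))p" "$file" >> "$out"
        fi
    done
}
```

For the files in the question this would be called as: strip_merge 20 12 combined.txt ABC*
Because the output is appended, remove combined.txt before running it a second time.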
If you prefer to use your head and tail commands instead of sed:
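for file in ABC*; do
    numlines=$(wc -l < "$file")
    # head -n -12 drops the 12-line footer (negative counts need GNU head);
    # tail -n +21 starts output at line 21, dropping the 20-line header.
    (( 32 < numlines )) && head -n -12 "$file" | tail -n +21 >> combined.txt
done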
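If speed matters with many files, a different approach (not the loops above) is to let a single awk process read all the files and hold back the last few lines of each in a small buffer, so no separate wc pass is needed. The following is a rough sketch under that idea; the function name merge_awk is hypothetical, and it assumes the footer length is at least 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: merge_awk HEADER FOOTER FILE...
# Writes each FILE minus its first HEADER and last FOOTER lines to stdout.
# Assumes FOOTER >= 1 (the buffer index uses a modulo by FOOTER).
merge_awk() {
    local header=$1 footer=$2
    shift 2
    awk -v h="$header" -v f="$footer" '
        FNR == 1 { n = 0 }                 # new file: start a fresh buffer
        FNR > h {                          # past the header
            if (n >= f) print buf[n % f]   # the line f positions back cannot be footer
            buf[n % f] = $0                # hold the current line back for now
            n++
        }
    ' "$@"
}
```

For the files in the question this would be called as: merge_awk 20 12 ABC* > combined.txt
The last 12 lines of each file stay in the buffer and are never printed, which is what removes the footer.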