We all know there are always multiple ways to solve a problem. I was wondering what the upsides and downsides of each particular solution would be in one specific case, in terms of time and space (and maybe cleanliness, but that is subjective, so it is not the main focus).
You have a file in which some, but not all, lines contain the string xyz. You are interested in those lines where the integer value in a particular column fulfills a condition.
An example where I used this is filtering the weak ciphers out of sslscan output. That is not particularly time- or space-intensive; the example is only meant to give a clearer picture of what this could look like.
The question came up while I was looking for a solution: I found various answers on Stack Overflow and then also came up with something myself.
Possible solution 1 (pure awk):
awk '$0~/xyz/ && $3 < 128 {$1=""; print}' file-with-data.txt
Possible solution 2 (awk + cut):
awk '$0~/xyz/ && $3 < 128' file-with-data.txt | cut -c15-
Possible solution 3 (bash):
grep xyz file-with-data.txt | while read -r line
do if [ $(echo $line | cut -d" " -f3) -le 127 ]
then echo $line
fi
done
A shell is an environment from which to call tools. It has certain programming language constructs to help you sequence the order in which you call tools. It was not created to, nor is it optimized in any way (e.g. language constructs) for, parsing text files.
Awk was created to parse text files. Its execution paradigm is based on that (a built-in loop over input records) and it has specific constructs to help with that (e.g. BEGIN and END sections, the variables NR, FNR, NF, etc.).
Any time you write a loop in shell to parse a text file, you have the wrong approach. The shell loop you wrote, unlike the awk script, will fail cryptically depending on the input values, the contents of the directory you run it from, the OS you are on, etc.
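To make one of those failure modes concrete, here is a minimal sketch of what an unquoted `echo $line` does to a line. The input line and the file name `notes.txt` are made up for illustration, and the behavior shown assumes a POSIX-style shell such as bash with default options:

```shell
# Scratch directory so the result of glob expansion is predictable.
tmpdir=$(mktemp -d)
touch "$tmpdir/notes.txt"        # hypothetical file that an unquoted * will match

line='xyz  *  56 bits'           # hypothetical input line containing a glob character

# Unquoted: the shell splits the line into words and expands * against
# the current directory, so the output depends on where you run it.
unquoted=$(cd "$tmpdir" && echo $line)

# Quoted: the line is passed through unchanged.
quoted=$(cd "$tmpdir" && printf '%s\n' "$line")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

rm -rf "$tmpdir"
```

The unquoted version both collapses the runs of spaces and replaces `*` with whatever files happen to be in the directory, which is why the same loop can behave differently from one directory to the next.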
IF you just need to find a string or RE in some text
THEN
use grep
ELIF you just need to select a single-char-separated field
THEN
use cut
ELIF you just need to do a simple substitution for an RE on a single line
THEN
use sed
ELSE
use awk
ENDIF
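As a rough illustration of that decision list, the sketch below runs one command per branch against the same made-up, colon-separated data. The data and the variable names are assumptions chosen only for the example:

```shell
# Hypothetical sample data: name:number:color
data=$(printf '%s\n' 'alpha:1:red' 'beta:2:green' 'gamma:3:red')

# Just find a string or RE: grep
found=$(printf '%s\n' "$data" | grep 'red')

# Just select a single-char-separated field: cut
field=$(printf '%s\n' "$data" | cut -d: -f2)

# Just a simple substitution for an RE on a single line: sed
subst=$(printf '%s\n' "$data" | sed 's/red/blue/')

# Anything else (here: a match combined with a numeric test on a field): awk
other=$(printf '%s\n' "$data" | awk -F: '/red/ && $2 > 1 { print $1 }')

printf '%s\n' "$found" "$field" "$subst" "$other"
```

The last case is the shape of the problem in the question: a pattern match plus a numeric condition on a column, which none of the simpler tools handles on its own.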
As for which of these two approaches to choose:
awk '$0~/xyz/ && $3 < 128 {$1=""; print}' file-with-data.txt
awk '$0~/xyz/ && $3 < 128' file-with-data.txt | cut -c15-
it doesn't matter. The second one has a bit of overhead from the extra process, but you'll never notice it, so just pick the one that best fits your requirements (e.g. do you really want to replace the first field with a blank, or do you really want to cut N chars?) and is easiest for you to write and understand. Personally I'd just stay in awk and use substr() if cutting is required.
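A minimal sketch of that substr() variant, assuming a fixed-width layout in which the part you want to keep starts at character 15 (mirroring `cut -c15-`). The sample lines and cipher names are invented for the example and are not real sslscan output:

```shell
# Hypothetical sample file: a 14-character first column, then the rest of the line.
sample_file=$(mktemp)
printf '%-14s%s\n' \
  'Accepted' 'xyz  56 bits  CIPHER-A' \
  'Accepted' 'xyz  256 bits  CIPHER-B' \
  'Rejected' 'abc  40 bits  CIPHER-C' > "$sample_file"

# Same filter as in the question, but the cut-ing is done inside awk:
# substr($0, 15) returns the line from character 15 to the end.
result=$(awk '/xyz/ && $3 < 128 { print substr($0, 15) }' "$sample_file")

printf '%s\n' "$result"
rm -f "$sample_file"
```

Unlike assigning `$1=""`, substr() leaves the rest of the line exactly as it was, including the original spacing between fields, because awk does not rebuild the record.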