I have a file with almost 5*(10^6) lines of integer numbers, so the file is fairly large.
The question is about extracting specific lines by filtering them with a condition. For example, I'd like to:
extract the lines whose number satisfies a condition (a math predicate)
Is there a clever way to perform these tasks (using sed, awk, cat, or head)?
Thanks in advance.
To extract the first $NUMBER lines:
head -n $NUMBER filename
Assuming every line contains just a number (although it will also work if the first token is one), 2 can be solved like this:
awk '$1 >= 1234 && $1 < 5678' filename
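As a quick sanity check of the range filter, here is a minimal sketch run against a small made-up file. The temporary file and the sample numbers are placeholders for illustration only, not anything from the question.

```shell
# Build a small sample file; the name and values are made up for illustration.
sample=$(mktemp)
printf '%s\n' 100 1234 3000 5678 9999 > "$sample"

# Keep lines whose first field lies in the half-open range [1234, 5678).
in_range=$(awk '$1 >= 1234 && $1 < 5678' "$sample")
printf '%s\n' "$in_range"

rm -f "$sample"
```

Note that the lower bound is included and the upper bound is not, so 1234 is printed while 5678 is dropped.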
In the same spirit, 3 is just the generalization:
awk 'condition' filename
It would have helped if you had specified what condition is supposed to be, though. As it stands, you'll have to read the awk documentation to find out how to code it. Again, the number will be represented by $1.
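Since the question does not say which predicate is wanted, the following is only a sketch of what such conditions might look like. The predicates (even numbers, multiples of 7 above a threshold), the variable names, and the sample data are all assumptions chosen for illustration.

```shell
# Sample data; file name and values are made up for illustration.
sample=$(mktemp)
printf '%s\n' 3 8 15 42 49 > "$sample"

# Hypothetical predicate 1: keep even numbers only.
evens=$(awk '$1 % 2 == 0' "$sample")

# Hypothetical predicate 2: multiples of 7 greater than a shell-supplied
# threshold, passed into awk with -v (the name "min" is arbitrary).
threshold=10
big_sevens=$(awk -v min="$threshold" '$1 % 7 == 0 && $1 > min' "$sample")

printf '%s\n' "$evens"
printf '%s\n' "$big_sevens"

rm -f "$sample"
```

Any expression awk can evaluate on $1 can be used the same way, combined with && and || as needed.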
I don't think I can explain anything about the head call; it's really just what it says on the tin. As for the awk lines: awk, like sed, works linewise. awk fetches lines in a loop and applies your code to each line. This code takes the form
condition1 { action1 }
condition2 { action2 }
# and so forth
For every line awk fetches, the conditions are checked in the order they appear, and the action associated with each condition is performed if that condition is true. It would, for example, have been possible to extract the first $NUMBER lines of a file with awk like this:
awk -v number="$NUMBER" '1 { print } NR == number { exit }' filename
where 1 is synonymous with true (like in C) and NR is the line number. The -v command line option initializes the awk variable number to $NUMBER. If no action is specified, the default action is { print }, which prints the whole line. So
awk 'condition' filename
is shorthand for
awk 'condition { print }' filename
...which prints every line where the condition holds.
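To make the rule-by-rule behaviour concrete, here is a small sketch on a made-up sample file. It checks that the awk one-liner for the first $NUMBER lines gives the same result as head, and shows two condition/action pairs being tested against every line in the order written. The file contents and the "big:" and "even:" labels are illustrative assumptions.

```shell
# Sample data; file name and values are made up for illustration.
sample=$(mktemp)
printf '%s\n' 5 12 7 30 2 > "$sample"
NUMBER=3

# awk version of head: print every line, then stop once line $NUMBER is reached.
via_awk=$(awk -v number="$NUMBER" '1 { print } NR == number { exit }' "$sample")
via_head=$(head -n "$NUMBER" "$sample")

# Two rules: each line is tested against both, so one line can trigger both actions.
labelled=$(awk '$1 > 10 { print "big:", $1 } $1 % 2 == 0 { print "even:", $1 }' "$sample")

printf '%s\n' "$via_awk" "$labelled"

rm -f "$sample"
```

A line such as 12 matches both rules and is therefore reported twice, once per rule, while 5 and 7 match neither and produce no output.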