i want to take out certain range of huge numeric values, 720000002774991000 to 720000002774991099. so i tried the below command,
awk -F, ' { if (($1 >= 720000002774991000) && ($1 <= 720000002774991099)) print $0} ' VOUCHER_DUMP_REPORT.csv | head
VOUCHER_DUMP_REPORT.csv is my input file and has only one column of that huge numbers.
but the output i am getting is not accurate, it has some values other than the range i gave.
output :
720000002774991065
720000002774991082
720000002774990985
720000002774991131
720000002774990919
720000002774991110
720000002774990947
720000002774991070
720000002774991042
720000002774991044
Looks like your numbers are too long to be correctly represented as integers.
You have a couple of possible solutions. With GNU awk, you can use -M
to enable support for arbitrary precision integers:
awk -M '$1 >= 720000002774991000 && $1 <= 720000002774991099' file
Otherwise, if you're sure that the first column only contains numbers, you can use a string comparison:
awk -F, -v min=720000002774991000 -v max=720000002774991099 '
BEGIN { lmin = length(min); lmax = length(max) } # save length of min and max
"" $1 < min || "" $1 > max { next } # skip lines which fail string comparison
{ l1 = length($1) } # calculate length of field
l1 >= lmin && l1 <= lmax # check that string length is correct
' file
"" $1
concatenates an empty string with the contents of the first field, which forces awk to treat it as a string rather than a number. Without this, the comparison would be numerical, rather than lexical, and you would have the same problem as in your original attempt.
A probably less efficient but possibly easier to understand version that uses string comparison would be:
awk -F, -v min=720000002774991000 -v max=720000002774991099 '
"" $1 >= min && "" $1 <= max \
&& length($1) >= length(min) && length($1) <= length(max)' file
As in the previous version, lines are printed that pass both the string comparison and the length comparison. The downside to this approach is that the length of min
, max
and $1
are calculated more times than necessary.
$ cat file
720000002774991065
720000002774991082
720000002774990985
720000002774991131
720000002774990919
720000002774991110
720000002774990947
720000002774991070
720000002774991042
720000002774991044
$ awk -M '$1 >= 720000002774991000 && $1 <= 720000002774991099' file
720000002774991065
720000002774991082
720000002774991070
720000002774991042
720000002774991044
$ awk -F, -v min=720000002774991000 -v max=720000002774991099 '
BEGIN { lmin = length(min); lmax = length(max) } # save length of min and max
"" $1 < min || "" $1 > max { next } # skip lines which fail string comparison
{ l1 = length($1) } # calculate length of field
l1 >= lmin && l1 <= lmax # check that string length is correct
' file
720000002774991065
720000002774991082
720000002774991070
720000002774991042
720000002774991044
$ awk -F, -v min=720000002774991000 -v max=720000002774991099 '
"" $1 >= min && "" $1 <= max \
&& length($1) >= length(min) && length($1) <= length(max)' file
720000002774991065
720000002774991082
720000002774991070
720000002774991042
720000002774991044