regex awk posix character-class

character class range in awk version 3.1.7


Unlike with grep, I am not able to define the size/range of the digit character class in awk. Any pointer in the right direction is appreciated.

cat input
1abc
12abc
123abc
1234abc
12345abc

In grep I can define the size/length of the digit character class:

grep -P '^\d{3,4}' input #or grep -P '^[[:digit:]]{3,4}' input
123abc
1234abc
12345abc
grep -P '^\d{4,}' input  #or grep -P '^[[:digit:]]{4,}' input 
1234abc
12345abc

Now I want to do this with awk, but the same regex does not work.

For example, the following commands do not give any output:

awk '/^[[:digit:]]{3,4}/' input 
awk '/^([[:digit:]]){3,4}/' input

I was expecting the above commands to print:

123abc
1234abc
12345abc

Note 1: Currently I spell out every digit to define the range, but that is not practical for a big range.

awk '/^[0-9][0-9][0-9][0-9]?/' input

Note 2:

awk --version |head -1
GNU Awk 3.1.7

Solution

  • Use the --posix option.

    The man page for awk version 3 says:

    r{n,m}     One or two numbers inside braces denote an interval expression.  If there is one number in the braces, the preceding
               regular expression r is repeated n times.  If there are two numbers separated by a comma, r is repeated n to m times.
               If there is one number followed by a comma, then r is repeated at least n times.
               Interval expressions are only available if either --posix or --re-interval is specified on the command line.
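
As a rough sketch of how this applies to the input from the question: the first two functions below simply pass the two options named in the man page excerpt to awk, and they assume that the `awk` on your PATH is GNU awk. The third function is an alternative that is not part of the original answer. It counts the leading digits with match() and RLENGTH, so it needs no interval expressions at all. All three function names are made up for illustration.

```shell
# Sample data from the question.
printf '%s\n' 1abc 12abc 123abc 1234abc 12345abc > input

# GNU awk 3.x: interval expressions such as {3,4} must be switched on
# explicitly with one of the two options from the man page.
# Assumption: `awk` here is GNU awk; other awks may reject these options.
with_posix() {
    awk --posix '/^[[:digit:]]{3,4}/' "$1"
}

with_re_interval() {
    awk --re-interval '/^[[:digit:]]{3,4}/' "$1"
}

# Alternative without interval expressions (hypothetical helper):
# match() sets RLENGTH to the length of the leading run of digits,
# and the line is printed when that run has at least `min` digits.
# Usage: leading_digits_at_least MIN FILE
leading_digits_at_least() {
    awk -v min="$1" 'match($0, /^[0-9]+/) && RLENGTH >= min + 0' "$2"
}
```

With the question's input, `with_posix input` should print 123abc, 1234abc and 12345abc on GNU awk 3.1.7, and `leading_digits_at_least 3 input` prints the same three lines. Because the pattern is anchored only at the start of the line, "3 to 4 leading digits" and "at least 3 leading digits" select the same lines, which is why 12345abc also appears.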