I've been self-studying shell scripting for a while now, and I came across a section of a Linux Fundamentals manual about grep and curly braces {}. My problem is that when I use {} to ask grep for a pattern repeated between a minimum and a maximum number of times, the result includes a line that exceeds the maximum I specified.
Here is what happened:
Express11:~/unix_training/reg_ex # cat reg_file2
ll
lol
lool
loool
loooose
Express11:~/unix_training/reg_ex # grep -E 'o{2,3}' reg_file2
lool
loool
loooose
Express11:~/unix_training/reg_ex #
According to the manual, this should not happen, since I am specifying that I only want strings containing two to three consecutive o's.
EDIT: Actually, the reason I did not understand how the curly braces work was the manual's simplistic explanation. I quote:
19.4.10. between n and m times
And here we demand exactly from minimum 2 to maximum 3 times.
paul@debian7:~$ cat list2
ll
lol
lool
loool
paul@debian7:~$ grep -E 'o{2,3}' list2
lool
loool
paul@debian7:~$ grep 'o\{2,3\}' list2
lool
loool
paul@debian7:~$ cat list2 | sed 's/o\{2,3\}/A/'
ll
lol
lAl
lAl
paul@debian7:~$
Thanks to all those who replied.
# grep -E 'o{2,3}' reg_file2
lool
loool
loooose
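The command works as intended. grep prints every line that contains a match anywhere in it, and o{2,3} says nothing about what surrounds the match. In the last line it matches the first three o's, which is why that line also appears in the output.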
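To see which part of each line the pattern actually matched, one option is the -o flag, which prints only the matched text. This is a minimal sketch, assuming a grep that supports -o (GNU and BSD grep do; it is not part of POSIX). The temporary file stands in for reg_file2 from the question.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' ll lol lool loool loooose > "$tmp"

# -o prints only the matched portion of each line, one match per output line.
# For "loooose" the match is the first three o's; the fourth o is simply left over.
matched=$(grep -oE 'o{2,3}' "$tmp")
printf '%s\n' "$matched"

rm -f "$tmp"
```

The last line of the output is ooo, not oooo: the pattern never matched more than three o's, but nothing in it forbids a fourth o right after the match.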
I think the command you're actually looking for is,
$ grep -P '(?<!o)o{2,3}(?!o)' file
lool
loool
Explanation:
(?<!o)
Negative lookbehind which asserts that the match won't be preceded by the letter o.
o{2,3}
Matches 2 or 3 o's.
(?!o)
Negative lookahead which asserts that the match won't be followed by the letter o.
OR
$ grep -E '(^|[^o])o{2,3}($|[^o])' file
lool
loool
Explanation:
(^|[^o])
Matches either the start of the line (^) or any character other than o.
o{2,3}
Matches 2 or 3 o's.
($|[^o])
Matches either the end of the line ($) or any character other than o.
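The difference between the plain pattern and the bounded one can be checked side by side. This is a small sketch that uses only grep -E, so it does not depend on -P being available. The temporary file and the variable names plain and bounded are placeholders for illustration.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' ll lol lool loool loooose > "$tmp"

# Plain interval: any line containing at least two consecutive o's is printed.
plain=$(grep -E 'o{2,3}' "$tmp")

# Bounded interval: the run of o's must be delimited by a non-o character
# (or the start/end of the line) on both sides, so a run of four is rejected.
bounded=$(grep -E '(^|[^o])o{2,3}($|[^o])' "$tmp")

printf 'plain:\n%s\n\nbounded:\n%s\n' "$plain" "$bounded"

rm -f "$tmp"
```

Note that the bounded ERE version consumes the neighbouring characters as part of the match, whereas the lookaround version only asserts them. For deciding which lines to print this makes no difference on the sample data, but it does matter if the matched text itself is used, for example with -o or in a sed substitution.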