linux grep kleene-star

grep: When should the Kleene star (*) match itself?


I am learning grep at the moment, but I am having difficulty understanding how the Kleene star metacharacter works. The man page says that * matches the previous character zero or more times. I am using a file named test with the following content:

*a
123ab
1234
abcdef
a?
?

grep 'a*' test should match zero or more occurrences of a and, as expected, prints every line of the test file. The man page further says that to match a metacharacter like * literally, it has to be escaped by preceding it with a backslash (\). But the output of grep '*' test and grep '\*' test is the same. Output: *a

Why is * matching itself without a preceding \?


Solution

  • * on its own has no previous item to repeat. In POSIX basic regular expressions, which grep uses by default, a * at the very start of the pattern is therefore treated as a literal *. (With grep -E the result of a leading * is undefined, so do not rely on it there.) \* is an escaped star, which always matches a literal *. That is why your grep gives the same output for * and \*: in this position both mean a literal *.

    If you really want to see the difference between * and \*, you should try it on a valid regular expression by adding an item before it. For example, a literal a:

    grep 'a*'
    grep 'a\*'
    

    The former will match every line, since a* can successfully match zero characters. The latter will only match lines containing a* literally.
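
To try this against the sample file from the question, something like the following sketch should work. It assumes a POSIX-style grep in its default (basic regular expression) mode; the temporary file created with mktemp stands in for the file named test, and the variable name f is just for illustration.

```shell
# Recreate the sample file from the question (a temp file stands in for "test").
f=$(mktemp)
printf '%s\n' '*a' '123ab' '1234' 'abcdef' 'a?' '?' > "$f"

# a* can match the empty string, so every line counts as a match.
grep -c 'a*' "$f"

# a\* needs a literal "a*"; no line in the file contains it.
grep 'a\*' "$f" || echo "no match"

# Escaped star: matches any line containing a literal *.
grep '\*' "$f"

# Leading unescaped star: nothing to repeat, so it is taken literally too.
grep '*' "$f"
```

Only the line *a contains a literal star, so the last two commands print the same thing, which is the behavior the question describes.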