regex pcre pcregrep

PCRE regex whitespace


I am trying to write a regex to pull all non-comment and non-empty lines from /etc/samba/smb.conf. Comments are lines that:

  1. start with #
  2. start with ;
  3. start with any amount of whitespace followed immediately by either # or ;

I tried the following, but it did not properly handle comment type 3.

grep -P '^\s*[^#;]' /etc/samba/smb.conf

This one worked for all 3 types of comments:

grep -P '^\s*[^#;\s]' /etc/samba/smb.conf

Can you explain why adding \s to the character class successfully filtered out comment type 3?


Solution

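  • The problem here is a partial match: with no end anchor $, the pattern only has to match the start of the line, and backtracking lets it do so.

    Take comment type 3, for example

          ;
    

    \s* first consumes all of the leading whitespace, and [^#;] then fails on the ;. The engine backtracks: \s* gives back one whitespace character, and [^#;] matches that character, because whitespace is neither # nor ;. The match therefore stops just before the ; and the line is printed. In the other regex you have excluded \s inside the character class, so it cannot match the whitespace that was given back, and the partial match is no longer possible.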
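    To see the backtracking for yourself, here is a minimal sketch, assuming GNU grep built with -P support. The variable names and the sample line are made up for illustration; -o prints only the matched text, and the brackets make the whitespace visible.

```shell
# A type 3 comment line: three spaces followed by ';' (made-up sample).
line='   ; a comment'

# First pattern: \s* gives back the third space so that [^#;] can match it.
first=$(printf '%s\n' "$line" | grep -oP '^\s*[^#;]')
printf 'first pattern matched: [%s]\n' "$first"

# Second pattern: the class also rejects whitespace, so nothing can match
# and grep exits with a non-zero status.
if printf '%s\n' "$line" | grep -qP '^\s*[^#;\s]'; then
    second_status='matched'
else
    second_status='no match'
fi
printf 'second pattern: %s\n' "$second_status"
```

    The first pattern matches only the three leading spaces, which is enough for grep to print the whole line.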

    The correct regex here is

     (?m)^(?!\s*[#;]).+$
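    A quick way to try this regex is to run it against a throwaway file. This is only a sketch: the sample content below is invented and stands in for /etc/samba/smb.conf, and it assumes GNU grep with -P support.

```shell
# Hypothetical sample file covering the three comment types and an empty line.
sample=$(mktemp)
printf '%s\n' \
    '# type 1 comment' \
    '; type 2 comment' \
    '   # type 3 comment' \
    '' \
    '[global]' \
    '   workgroup = WORKGROUP' > "$sample"

# The negative lookahead rejects any line whose first non-whitespace
# character is # or ;, and .+ rejects empty lines.
result=$(grep -P '(?m)^(?!\s*[#;]).+$' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

    Only the [global] and workgroup lines are printed. Two things to keep in mind: grep already tests each line separately, so (?m) should make no difference to the result here, and .+ still accepts a line made up of whitespace only, which your second regex would reject.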
    
