regex grep word-boundary character-class

Hyphen/Dash to be included in regex word boundary \b


Simply put:

echo "xxxxx Tyyy zzzzz" | egrep "\byyy\b" 

(no match, which is correct)

echo "xxxxx T-yyy zzzzz" | egrep "\byyy\b" 
xxxxx T-yyy zzzzz

I don't want the second command to match, because yyy is attached to T by a hyphen. How can I make \b treat the hyphen as part of the word? Thanks.


Solution

  • You can use:

    echo "xxxxx T-yyy zzzzz" | grep -E "(^|[^-])\byyy\b([^-]|$)"
    

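    Here (^|[^-])\byyy\b([^-]|$) requires either the start of the line or a non-hyphen character immediately before yyy, and either a non-hyphen character or the end of the line immediately after it. The extra check is needed because \b treats - as a non-word character, so on its own it finds a word boundary between - and y.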
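
To check the pattern against the cases from the question, something like the following can be used. This is a minimal sketch, assuming GNU grep (where \b is accepted under -E); the check helper and the extra test strings are illustrative and not part of the original answer.

```shell
#!/bin/sh
# Pattern from the answer: yyy as a whole word with no hyphen on either side.
pattern='(^|[^-])\byyy\b([^-]|$)'

# Hypothetical helper: prints "match" or "no match" for a single input line.
check() {
    if printf '%s\n' "$1" | grep -Eq "$pattern"; then
        echo "match"
    else
        echo "no match"
    fi
}

check "xxxxx yyy zzzzz"     # match: standalone word
check "xxxxx Tyyy zzzzz"    # no match: no word boundary between T and y
check "xxxxx T-yyy zzzzz"   # no match: hyphen before yyy
check "xxxxx yyy-T zzzzz"   # no match: hyphen after yyy
check "yyy"                 # match: start and end of line
```

Note that the pattern consumes the neighbouring character on each side, so with grep -o the printed match would include those characters as well as yyy.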