I want to use grep and a regular expression to search a text document. When I type in this:
grep -o ((D|d)ie|(D|d)as|(D|d)e(r|n|m|s)|(ei|Ei)(n|ne|nen|nem|ner|nes)) [A-ZÄÖÜ][A-Za-zäöü]* document.txt
I get this:
-bash: syntax error near unexpected token `('
I already tried putting the regular expression in quotation marks. That gets rid of the error, but then grep doesn't find anything. Thank you for helping me.
For example, the following sentence is in my document:
Der Mann und die Frau haben ein Haus.
I want to extract:
Der Mann
die Frau
ein Haus
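Put the pattern in single quotes and enable Extended Regular Expression support with -E. Without quotes, the shell tries to interpret the parentheses and pipes itself, which causes the syntax error. With quotes but without -E, grep uses basic regular expression syntax, where ( ) and | are literal characters, so nothing matches.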
grep -Eo '((D|d)ie|(D|d)as|(D|d)e(r|n|m|s)|(ei|Ei)(n|ne|nen|nem|ner|nes)) [A-ZÄÖÜ][A-Za-zäöü]*' document.txt
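To see the difference -E makes, here is a small sketch that runs the same quoted pattern with and without it. The temporary file and the variable names (`sample`, `pattern`, `bre_matches`, `ere_matches`) are made up for this demo; the sentence is the one from the question.

```shell
# Hypothetical demo file holding the example sentence from the question.
sample=$(mktemp)
printf '%s\n' 'Der Mann und die Frau haben ein Haus.' > "$sample"

pattern='((D|d)ie|(D|d)as|(D|d)e(r|n|m|s)|(ei|Ei)(n|ne|nen|nem|ner|nes)) [A-ZÄÖÜ][A-Za-zäöü]*'

# Basic syntax (no -E): ( ) and | are taken literally, so nothing matches.
# "|| true" only hides grep's non-zero exit status for "no match".
bre_matches=$(grep -o "$pattern" "$sample" || true)

# Extended syntax (-E): ( ) group and | alternates.
ere_matches=$(grep -Eo "$pattern" "$sample")

printf 'without -E: [%s]\n' "$bre_matches"
printf 'with -E:\n%s\n' "$ere_matches"

rm -f "$sample"
```

The single quotes matter here as well: they keep the shell from touching the parentheses and pipes before grep sees them.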
Bear in mind that (D|d) can be written more simply as a bracket expression, [Dd]. The same applies to the other parts of your regular expression where you are OR-ing single characters.
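As a sketch of that simplification, the pattern could be rewritten with bracket expressions as follows. The rewrite of the "ein" forms as `[Ee]in(e[nmrs]?)?` is my own shorthand for ein, eine, einen, einem, einer and eines; the temporary file and the variable names are only for the demo.

```shell
# Hypothetical demo file holding the example sentence from the question.
sample=$(mktemp)
printf '%s\n' 'Der Mann und die Frau haben ein Haus.' > "$sample"

# [Dd] replaces (D|d), [rnms] replaces (r|n|m|s), and the "ein" endings
# are folded into an optional group.
result=$(grep -Eo '([Dd]ie|[Dd]as|[Dd]e[rnms]|[Ee]in(e[nmrs]?)?) [A-ZÄÖÜ][A-Za-zäöü]*' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

On the example sentence this should print the same three article and noun pairs as the longer pattern.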
As mentioned in the comments, another option to consider is the -i option, which means that the case of the characters is ignored entirely.
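One thing to keep in mind with -i: it applies to the whole pattern, so [A-ZÄÖÜ] no longer forces the second word to start with a capital letter. The sketch below compares both variants on a made-up sentence (not from the question); the variable names are only for illustration.

```shell
# Hypothetical sentence in which an article is followed by a lowercase word.
sample=$(mktemp)
printf '%s\n' 'Der Mann und der alte Hund' > "$sample"

# Case-sensitive: the word after the article must be capitalized.
strict=$(grep -Eo '([Dd]ie|[Dd]as|[Dd]e[rnms]|[Ee]in(e[nmrs]?)?) [A-ZÄÖÜ][A-Za-zäöü]*' "$sample")

# With -i: shorter pattern, but "der alte" is now reported as well.
loose=$(grep -Eio '(die|das|de[rnms]|ein(e[nmrs]?)?) [A-ZÄÖÜ][A-Za-zäöü]*' "$sample")

printf 'case-sensitive:\n%s\n' "$strict"
printf 'with -i:\n%s\n' "$loose"

rm -f "$sample"
```

If only capitalized nouns should be extracted, the bracket-expression version is probably the safer choice.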