I have a text file with the following contents:
gvožđa gvozda gvozdja
гвожђа
These are four words, but they all mean the same thing: iron.
The "d", "dj", "đ", and "ђ" are four spellings of a single "phone".
I am using the following grep command to search for these four words:
grep '\s*[gг][vв]o[žжz](dj|[dđђ])a\s*' filename
This grep command gives no output at all. Why? It should give all these words in the file:
gvožđa
gvozda
gvozdja
гвожђа
The problem occurs because your pattern does not match the Cyrillic о and а, and because you use a POSIX ERE pattern without the -E option. Without -E, grep uses the BRE syntax, in which an unescaped (, ), or | is matched as a literal character.
You can use
grep -Eo '[gг][vв][oо][žжz](dj|[dđђ])[aа]' filename
Using \s* does not actually make sense here: it matches zero or more whitespace characters, so it never changes which lines match (and \s is only supported in GNU grep).
I added the -o option here to output each match on its own line, not just the matched lines.
See the online grep demo.
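To make the difference easy to reproduce, here is a minimal sketch that runs both commands against a temporary file holding the contents from the question. The temporary file, the variable names, and the C.UTF-8 locale are assumptions for illustration; any UTF-8 locale installed on your system should do, since in a non-UTF-8 locale the bracket expressions would match single bytes rather than whole characters.

```shell
# Assumption: the C.UTF-8 locale exists on this system; substitute any
# installed UTF-8 locale (for example en_US.UTF-8) if it does not.
export LC_ALL=C.UTF-8

# Hypothetical sample file with the contents from the question.
sample=$(mktemp)
printf '%s\n' 'gvožđa gvozda gvozdja' 'гвожђа' > "$sample"

# Original command: no -E, and only Latin o and a in the pattern.
# grep exits with status 1 when nothing matches, hence the "|| true".
before=$(grep '\s*[gг][vв]o[žжz](dj|[dđђ])a\s*' "$sample" || true)

# Corrected command: -E enables ERE, and both Latin and Cyrillic
# o/a are listed in the bracket expressions.
after=$(grep -Eo '[gг][vв][oо][žжz](dj|[dđђ])[aа]' "$sample")

printf 'original command printed: [%s]\n' "$before"
printf 'corrected command printed:\n%s\n' "$after"

rm -f "$sample"
```

With a working UTF-8 locale, the first variable should stay empty and the second should hold the four words, one per line. Dropping -o from the corrected command would print the two matching lines of the file instead.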