
Find all msgstr entries that are either non-empty or multiline using a regexp in a .po file


In a .po file, what regexp could I use to find all msgstr entries that are either a single non-empty line:

msgstr "some text"

or spread over several lines:

msgstr ""
"some multiline text"
"goes here"

but not empty ones:

msgstr ""

?

For the moment I'm using this:

(msgstr "[\D\s]+")|(msgstr ""[\n\D\s]+")

But it's not fully working.


Solution

  • If every continuation line must be a double-quoted string containing at least one character:

    ^msgstr (?:"[^"\r\n]+"|.*(?:\n"[^\r\n"]+")+)
    

    The pattern matches:

    • ^ Start of a line (this assumes the multiline flag is enabled; without it, ^ only matches at the start of the whole string)
    • msgstr Match literally, followed by a space
    • (?: Non-capturing group for the 2 alternatives
      • "[^"\r\n]+" Match from an opening to a closing double quote with at least one character in between that is not a double quote or a newline
      • | Or
      • .* Match the rest of the line
      • (?:\n"[^\r\n"]+")+ Repeat 1 or more times: a newline followed by a quoted string with at least one character other than a double quote or a newline
    • ) Close the non-capturing group
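
As a rough illustration of the breakdown above, here is a minimal Python sketch that applies the pattern to a small sample. The sample text and the helper name find_translated are made up for this example, and re.MULTILINE is assumed so that ^ matches at the start of each line.

```python
import re

# Pattern from the answer; re.MULTILINE makes ^ match at the start of every line.
PATTERN = re.compile(r'^msgstr (?:"[^"\r\n]+"|.*(?:\n"[^\r\n"]+")+)', re.MULTILINE)

# Hypothetical .po content, invented for illustration only.
SAMPLE_PO = '''msgid "one"
msgstr "some text"

msgid "two"
msgstr ""
"some multiline text"
"goes here"

msgid "three"
msgstr ""
'''


def find_translated(po_text):
    """Return every msgstr entry that is non-empty or continues on following lines."""
    return [m.group(0) for m in PATTERN.finditer(po_text)]


matches = find_translated(SAMPLE_PO)
for entry in matches:
    print(repr(entry))
```

With this sample, the single-line entry and the multiline entry are returned, while the last, empty msgstr "" is skipped. Files with \r\n line endings would probably need the pattern adjusted (for example \r?\n instead of \n), which is not covered here.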


    If the continuation lines can also be empty strings (""):

    ^msgstr (?:"[^"\r\n]+"|.*(?:\n".*")+)
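
To show where the two variants differ, the following Python sketch runs both on an entry whose first continuation line is an empty string. The names STRICT, RELAXED and ENTRY are hypothetical, and re.MULTILINE is again an assumption about how the patterns are meant to be used.

```python
import re

# First variant: every continuation line needs at least one character inside the quotes.
STRICT = re.compile(r'^msgstr (?:"[^"\r\n]+"|.*(?:\n"[^\r\n"]+")+)', re.MULTILINE)
# Second variant: continuation lines may be empty strings.
RELAXED = re.compile(r'^msgstr (?:"[^"\r\n]+"|.*(?:\n".*")+)', re.MULTILINE)

# Hypothetical entry, invented for illustration: the first continuation line is "".
ENTRY = 'msgstr ""\n""\n"text after an empty line"\n'

strict_match = STRICT.search(ENTRY)    # no match: the "" line breaks the repetition
relaxed_match = RELAXED.search(ENTRY)  # matches the whole three-line entry

print(strict_match)
print(repr(relaxed_match.group(0)))
```

One trade-off to keep in mind: the relaxed variant also matches an entry made up only of empty strings (msgstr "" followed by a line containing just ""), which may or may not count as an empty translation for your purposes.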
    
