regex pcre quotation-marks

Regex to match space after opening quotation mark


I have written a regex to match sentences with quotation marks on both sides in a single line:
(?<!")"([^"\r]+)"(?!")

Input Text:
The sign said, "Walk." Then it said, "Don't Walk", then, "Walk", all within thirty seconds. He yelled, "Hurry up."

Match 1: "Walk."
Match 2: "Don't Walk"
Match 3: "Walk"
Match 4: "Hurry up."

Now I want only the matches that have a single space after the opening quotation mark.

I tried adding (\ {1}) to the regex right after the first quotation mark. It now looks like:
(?<!")"((\ {1})[^"\r]+)"(?!")

My new match is:
Match 1: " Then it said, "

But I expected no matches, because none of my earlier 4 matches has a single space after the opening quotation mark.

Now the whole thing is messed up: the regex ignores the initial structure and pairs up quotation marks independently, so it looks for a space even after a closing quotation mark.
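
The behaviour described above can be reproduced with a short sketch. Python's standard `re` module is used here only as a stand-in for the PCRE engine (an assumption; both patterns use only syntax the two engines share), and the variable names are illustrative.

```python
import re

text = ('The sign said, "Walk." Then it said, "Don\'t Walk", then, "Walk", '
        'all within thirty seconds. He yelled, "Hurry up."')

# Original pattern: a quoted run with no doubled quote on either side.
original = re.compile(r'(?<!")"([^"\r]+)"(?!")')
# Attempted pattern: the same, but requiring a space right after the first quote.
attempt = re.compile(r'(?<!")"((\ {1})[^"\r]+)"(?!")')

original_matches = [m.group(0) for m in original.finditer(text)]
attempt_matches = [m.group(0) for m in attempt.finditer(text)]

print(original_matches)  # the four quoted strings
print(attempt_matches)   # a single match that starts at a CLOSING quote
```

The second list contains only `" Then it said, "`: every real opening quote fails the space requirement, so the engine moves on and starts a match at the closing quote of "Walk." instead.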

Any idea how to resolve this?

Thanks


Solution

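  • The problem is that the double quote is both your opening and your closing delimiter. When an opening quote is not followed by a space, the match fails there and the engine moves on to the closing quote, which it then treats as an opening one.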

    Use PCRE regex:

    (?<!")"(?!\ )([^"\r]+)"(?!")(*SKIP)(*F)|(?<!")"\ ([^"\r]+)"(?!")
    

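    The first alternative, (?<!")"(?!\ )([^"\r]+)"(?!")(*SKIP)(*F), matches double-quoted strings that do not have a space after the opening " and then discards them: (*F) forces the match to fail, and (*SKIP) stops the engine from retrying at any position inside the text just consumed, so a closing quote is never mistaken for an opening one. The second alternative, (?<!")"\ ([^"\r]+)"(?!"), then returns the expected matches.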
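
The (*SKIP)(*F) verbs are PCRE-specific; Python's standard `re` module, for example, does not support them. In such an engine, a similar effect can be sketched with a different technique: let the first alternative consume the unwanted quoted strings without capturing, capture the wanted ones in a group, and keep only the matches where that group is set. The helper name `spaced_quotes` and the second sample sentence are hypothetical, chosen only for illustration.

```python
import re

# Left branch: quoted strings NOT opening with a space (consumed, not captured).
# Right branch: quoted strings opening with a space (captured in group 1).
pattern = re.compile(r'(?<!")"(?! )[^"\r]+"(?!")|(?<!")"( [^"\r]+)"(?!")')

def spaced_quotes(text):
    """Return the contents of quoted strings that open with a space."""
    return [m.group(1) for m in pattern.finditer(text)
            if m.group(1) is not None]

question_text = ('The sign said, "Walk." Then it said, "Don\'t Walk", then, '
                 '"Walk", all within thirty seconds. He yelled, "Hurry up."')

# No quoted string in the question's input opens with a space.
print(spaced_quotes(question_text))
# Hypothetical input where exactly one quoted string opens with a space.
print(spaced_quotes('He said, "Walk." Then, " Run", and "Stop".'))
```

Because the left branch consumes each complete quoted string, the scan resumes after its closing quote, which plays the same role as (*SKIP) in the PCRE version.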