I have some OCRed documents where the commas have been recognized as full stops in some places.
Like this:
staring thoughtfully into his empty coffee cup. and he absently
How do I find these instances in the document and replace them without having to find every '. ' manually?
I can't get my head around the different expressions.
I do know I can use [a-z]\.(.)[A-Z] to find and mark 'p. a' in this example but it also marks 'p. A'.
I only want to change the 'p. a' in these instances to 'p, a'.
Is this possible?
The (.) part captures any char except a newline in a group, and [A-Z] matches an uppercase char.
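To illustrate why the pattern from the question marks both 'p. a' and 'p. A', here is a small Python sketch. It uses Python's re module as a stand-in for the Notepad++ search, and the helper name find_match is made up for this example. It assumes the original search was run with "Match case" unchecked, in which case [A-Z] also matches lowercase letters.

```python
import re

# Pattern from the question: lowercase letter, dot, any char (captured), uppercase letter
QUESTION_PATTERN = r"[a-z]\.(.)[A-Z]"


def find_match(text, ignore_case):
    """Return the matched text, or None if the pattern does not match.

    ignore_case=True mimics a search with "Match case" unchecked (assumption).
    """
    flags = re.IGNORECASE if ignore_case else 0
    match = re.search(QUESTION_PATTERN, text, flags)
    return match.group(0) if match else None


# Case-insensitive: [A-Z] also accepts the lowercase 'a'
print(find_match("coffee cup. and he", ignore_case=True))
# Case-sensitive: the lowercase 'a' no longer satisfies [A-Z]
print(find_match("coffee cup. and he", ignore_case=False))
# Case-sensitive: a real sentence boundary is still matched
print(find_match("coffee cup. And he", ignore_case=False))
```

The captured group only ever holds the single char after the dot (here a space), which is why it is not needed for this task.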
In this case you don't need the group. You can match a lowercase char a-z followed by a dot, and then assert with a lookahead that one or more horizontal whitespace chars followed by another lowercase char come next.
To keep the already matched lowercase char out of the replacement, you can use \K to discard what is matched so far, so that only the dot is replaced. In the replacement use a comma.
-> Enable "Regular expression"
-> Check "Match case"
-> Check "Wrap around"
Find what:
[a-z]\K\.(?=\h+[a-z])
Replace with:
,
See a regex demo
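If you would rather batch-process the files outside Notepad++, the same idea can be sketched in Python. This is only an approximation under a few assumptions: Python's re module supports neither \K nor \h, so a lookbehind (?<=[a-z]) stands in for [a-z]\K and [ \t] stands in for \h (which covers spaces and tabs only). The function name fix_ocr_commas is made up for this example.

```python
import re

# Lookbehind replaces [a-z]\K, and [ \t]+ approximates \h+ (spaces and tabs only).
# Matching is case-sensitive by default, like having "Match case" checked.
COMMA_FIX = re.compile(r"(?<=[a-z])\.(?=[ \t]+[a-z])")


def fix_ocr_commas(text: str) -> str:
    """Replace a dot between a lowercase letter and whitespace + lowercase letter with a comma."""
    return COMMA_FIX.sub(",", text)


sample = "staring thoughtfully into his empty coffee cup. and he absently"
print(fix_ocr_commas(sample))
```

Note that both this sketch and the Notepad++ pattern will also change abbreviations followed by a lowercase word, such as "e.g. this", so it is worth reviewing the matches before replacing everything.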