I'm formatting a very large number of plaintext files using Java, and I need to remove all punctuation except for apostrophes. When I originally set up the regex for the replaceAll
statement, it got rid of everything I knew of, but now I've found one particular file/punctuation set where it doesn't work.
holdMe = holdMe.replaceAll("[,_\"-.!?:;)(}{]", " ");
I know I'm hitting this statement because all of the other punctuation is cleared: there are no periods, commas, etcetera left. I've tried escaping the () and {} characters, but those characters still don't get replaced. I've been trying to teach myself regex using the Oracle documentation, but I can't understand why this isn't working.
This regex matches every punctuation character (the Unicode category \p{P}) except the apostrophe (U+0027):
[\p{P}&&[^\u0027]]
The same regex as a Java string literal:
"[\\p{P}&&[^\u0027]]"