Is there a regex for the String.replaceAll method that keeps only letters and whitespace?

I have written a program that counts the frequency of each word in a very long string. My problem is that the program counts, for example, "*it" (where * stands for a quotation mark) and "it" as different words and therefore puts them in different categories.

I tried to replace all the punctuation marks I know of with the following code:

text = text.replace("\n", " ");
text = text.replaceAll("\\p{Punct}", " ");
text = text.replace("\"", "");
text = text.replace("–", "");
text = text.replace("\t", "");

Unfortunately, the code didn't work. I think this is because Unicode has many different quotation marks that I can't tell apart by eye. Is there a way to remove all Unicode characters except letters and whitespace with the String.replaceAll method, or do I have to convert the string to a char array and continue from there?

Thanks a lot, any help would be appreciated.

Solution

I think this might do it:

text = text.replaceAll("[^a-zA-Z0-9 ]", "");

This removes every character that is not an ASCII letter (a-z, A-Z), a digit, or a space.
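To illustrate what that pattern keeps and drops, here is a small sketch. The class name AsciiFilterDemo, the method asciiOnly, and the sample strings are made up for illustration. Note that the pattern also strips letters outside a-z, such as é, and that tabs and newlines disappear because only the literal space is whitelisted.

```java
public class AsciiFilterDemo {
    // Hypothetical helper: keep only ASCII letters, digits, and the space character.
    static String asciiOnly(String text) {
        return text.replaceAll("[^a-zA-Z0-9 ]", "");
    }

    public static void main(String[] args) {
        // Curly quotes (U+201C, U+201D) are removed along with ASCII punctuation.
        System.out.println(asciiOnly("\u201Cit\u201D, it!")); // prints: it it
        // Non-ASCII letters are removed too: the e-acute (U+00E9) disappears.
        System.out.println(asciiOnly("caf\u00E9 2"));          // prints: caf 2
    }
}
```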

EDIT:

As suggested by @npinti, \p{L} matches a letter from any language rather than only a-z:

text = text.replaceAll("[^\\p{L}0-9 ]", "");
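The question asked for letters and whitespace only, while the patterns above also keep digits and treat the plain space as the only allowed whitespace. A possible variant, sketched below under the assumption that digits should go too, replaces each run of other characters with a single space so that words separated only by punctuation (for example "end.Start") do not merge. The class name WordCleaner and both method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class WordCleaner {
    // Hypothetical helper: replace every run of characters that are neither
    // Unicode letters (\p{L}) nor whitespace (\s) with a single space.
    static String keepLettersAndWhitespace(String text) {
        return text.replaceAll("[^\\p{L}\\s]+", " ");
    }

    // Hypothetical helper: count word frequencies, ignoring case.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : keepLettersAndWhitespace(text).trim().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word.toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample text with curly quotes (U+201C/U+201D), an en dash (U+2013),
        // and an accented letter (U+00E9).
        String text = "\u201CIt\u201D is what it is \u2013 caf\u00E9!";
        System.out.println(countWords(text));
    }
}
```

With this approach "It" inside curly quotes and a bare "it" land in the same bucket, and no char array is needed.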