I'm trying to compile a single word list with no repeats out of 8 separate dictionary word lists. Some of the dictionaries use punctuation to separate the words. Below is the part of my code that pertains to the punctuation removal. I've tried several regex solutions that I found on Stack Overflow, as well as the one I've left in place in my code. For some reason, none of them remove the punctuation from the source dictionaries. Can someone tell me what I've done wrong here and how to fix it? I'm at a loss; a coworker checked it and says it ought to work as well.
int i = 1;
boolean checker = true;
Scanner inputWords;
PrintWriter writer = new PrintWriter(
"/home/htarbox/Desktop/fullDictionary.txt");
String comparison, punctReplacer;
ArrayList<String> compilation = new ArrayList<String>();
while (i <9)
{
inputWords = new Scanner(new File("/home/htarbox/Desktop/"+i+".txt"));
while(inputWords.hasNext())
{
punctReplacer = inputWords.next();
punctReplacer.replaceAll("[;.:\"()!?\\t\\n]", "");
punctReplacer.replaceAll(",", "");
punctReplacer.replaceAll("\u201C", "");
punctReplacer.replaceAll("\u201D", "");
punctReplacer.replaceAll("’", "'");
System.out.println(punctReplacer);
compilation.add(punctReplacer);
}
}
inputWords.close();
}
i = 0;
The line
punctReplacer.replaceAll(",", "");
returns a new String with your replacement (which you're ignoring). It doesn't modify the existing String. As such you need:
punctReplacer = punctReplacer.replaceAll(",", "");
Strings are immutable. Once created, you can't change them, and any String manipulation method returns a new String.
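To make the point concrete, here is a minimal sketch of the same cleanup with each result assigned back. The class name `PunctuationDemo`, the helper `stripPunctuation`, and the sample tokens are made up for illustration and are not from the question. Folding the comma and curly quotes into one character class, and collecting into a `LinkedHashSet` to drop repeats, are assumptions about what the asker wants rather than the only way to do it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class PunctuationDemo {

    // Hypothetical helper: returns a cleaned copy of word; the argument itself is never modified.
    static String stripPunctuation(String word) {
        String cleaned = word;
        // Assign the result back: replaceAll returns a new String.
        cleaned = cleaned.replaceAll("[;.:,\"()!?\\t\\n\u201C\u201D]", "");
        // Plain replace (no regex needed) to turn a curly apostrophe into a straight one.
        cleaned = cleaned.replace("\u2019", "'");
        return cleaned;
    }

    public static void main(String[] args) {
        String original = "\u201Chello,\u201D";
        original.replaceAll(",", "");               // result discarded, original is unchanged
        System.out.println(original.contains(",")); // prints true

        // A Set silently ignores duplicates, which covers the "no repeats" requirement.
        Set<String> words = new LinkedHashSet<>();
        for (String token : new String[] {"apple,", "apple", "(pear)", "don\u2019t"}) {
            words.add(stripPunctuation(token));
        }
        System.out.println(words);                  // prints [apple, pear, don't]
    }
}
```

In the original loop, the equivalent change is to write `punctReplacer = punctReplacer.replaceAll(...)` on every replacement line before calling `compilation.add(punctReplacer)`.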