I am reading stop words from a file, which I am saving in a HashSet. I compare said HashSet with a String to check for stop words.
If I put a single stop word, such as "the", in the String variable, my output is "Yes". However, if I put something like "Apple is it" or "it is an apple", the output is "No", despite the fact that both String variables contain stop words.
Here's the whole program, containing two methods, one for reading the file and one for removing the stop words:
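import java.io.File;
import java.util.HashSet;
import java.util.Scanner;

private static HashSet<String> readFile() {
    HashSet<String> hset = new HashSet<String>();
    // try-with-resources closes the Scanner automatically and avoids a
    // NullPointerException in a finally block when the file cannot be opened
    try (Scanner x = new Scanner(new File("StopWordsEnglish"))) {
        while (x.hasNext()) {
            hset.add(x.next()); // one whitespace-separated token per entry
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return hset;
}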
public static void removeStopWords() {
    HashSet<String> hset = readFile();
    System.out.println(hset.size());
    System.out.println("Enter a word to search for: ");
    String search = "is";
    String s = search.toLowerCase();
    System.out.println(s);
    if (hset.contains(s)) {
        System.out.println("Yes");
    } else {
        System.out.println("No");
    }
}
I have a feeling I'm not reading your question correctly. But here goes.
Assuming:
String search = "it is an apple";
Then you should probably split the string and check each word individually, because hset.contains(s) only returns true when the whole string equals one entry in the set.
String[] split = search.split(" ");
boolean found = false;
for (String s : split) {
    if (hset.contains(s.toLowerCase())) {
        found = true;
        break; // no need to continue once a stop word is found
    }
}
System.out.println(found ? "Yes" : "No");
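If the end goal is to actually remove the stop words rather than just detect them, the same per-word check can be reused. The following is only a rough sketch: the class name StopWordCheck and both method names are made up for illustration, and the small hard-coded set in main stands in for whatever your readFile() returns. It splits on any run of whitespace instead of a single space, which is an assumption about how your input looks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; the name and methods are illustrative only.
final class StopWordCheck {

    // Returns true if at least one whitespace-separated token of text is in stopWords.
    static boolean containsStopWord(Set<String> stopWords, String text) {
        for (String token : text.trim().split("\\s+")) {
            if (stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    // Returns the tokens of text that are not stop words, joined by single spaces.
    static String removeStopWords(Set<String> stopWords, String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // Stand-in for the set loaded from the stop word file (assumed lowercase entries).
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "is", "it", "an"));
        System.out.println(containsStopWord(stopWords, "Apple is it"));
        System.out.println(removeStopWords(stopWords, "it is an apple"));
    }
}
```

Note that this lowercases each token only for the lookup, so it assumes the entries in your file are lowercase as well. Punctuation attached to a word (for example "is,") would still not match and would need to be stripped first.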