I want to filter a text, leaving only letters (a-z and A-Z). It seemed easy, following something like the question "How to filter a Java String to get only alphabet characters?":
String cleanedText = text.toString().toLowerCase().replaceAll("[^a-zA-Z]", "");
System.out.println(cleanedText);
The problem is that the output of this is empty unless I change the regex by adding another character, e.g. a colon:
--> [^:a-zA-Z]
I already tried to check whether it works with the regex classes directly (not using the replaceAll method provided by Java's String class), but I had exactly the same problem.
Any idea what could be the source of this strange behavior?
I have a .txt file that I read using a BufferedReader. I append each line to one long string and apply the code I posted above to it. The whole code is as follows:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Loader {
    public static void main(String[] args) {
        StringBuilder text = new StringBuilder();
        String str;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader file = new BufferedReader(new FileReader("text.txt"))) {
            // Concatenate all lines into one long string (line breaks are dropped)
            while ((str = file.readLine()) != null) {
                text.append(str);
            }
            // Remove everything except lowercase letters and ':' (the workaround regex)
            String cleanedText = text.toString().toLowerCase().replaceAll("[^:a-z]", "");
            System.out.println(cleanedText);
        } catch (IOException ex) {
            // Also covers FileNotFoundException; report it instead of swallowing it
            ex.printStackTrace();
        }
    }
}
The text file is a normal article from which I want to delete everything (including whitespace) that is not a letter. An extract is as follows: "[16]The Free Software Foundation (FSF), started in 1985, intended the word "free" to mean freedom to distribute"
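As a sanity check, the regex can be run against the extract above in isolation, without any file reading. The following is only a minimal sketch: the class name `LetterFilter` and the methods `viaString` and `viaPattern` are made up for illustration, and `Locale.ROOT` is added so that lowercasing does not depend on the system locale. It compares `String.replaceAll` with an explicit `java.util.regex.Pattern`.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class LetterFilter {
    // Precompiled pattern: any character that is not an ASCII letter.
    private static final Pattern NON_LETTERS = Pattern.compile("[^a-zA-Z]");

    // Variant 1: String.replaceAll, as in the question (locale made explicit).
    static String viaString(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-zA-Z]", "");
    }

    // Variant 2: explicit Pattern/Matcher; should give the same result.
    static String viaPattern(String text) {
        return NON_LETTERS.matcher(text.toLowerCase(Locale.ROOT)).replaceAll("");
    }

    public static void main(String[] args) {
        String extract = "[16]The Free Software Foundation (FSF), started in 1985, "
                + "intended the word \"free\" to mean freedom to distribute";
        // Both lines print:
        // thefreesoftwarefoundationfsfstartedinintendedthewordfreetomeanfreedomtodistribute
        System.out.println(viaString(extract));
        System.out.println(viaPattern(extract));
    }
}
```

If both variants print the expected letters for a short input like this, the regex itself is unlikely to be the cause of an empty-looking result for the full file.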
In the end, the problem was neither the regex nor the program itself. Eclipse simply does not show the output in the console if it exceeds a certain length (the string is still there, and you can keep working with it). To solve this, check the "Fixed width console" option under Window -> Preferences -> Run/Debug -> Console, as described in http://code2care.org/2015/how-to-word-wrap-eclipse-console-logs-width/
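Independently of the console setting, one way to confirm that the string is not actually empty is to print its length, or to break it into shorter lines before printing. This is only a sketch under assumptions: the class `ConsoleWrap`, the helper `wrap`, the width of 80 characters, and the generated test string are all illustrative and not part of the original program.

```java
class ConsoleWrap {
    // Hypothetical helper: emits the text in chunks of at most `width`
    // characters, each followed by a line break.
    static String wrap(String s, int width) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i += width) {
            out.append(s, i, Math.min(i + width, s.length())).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the cleaned article text: one very long line of letters.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("letters");
        }
        String cleanedText = sb.toString();

        // The length shows whether the string really is empty.
        System.out.println("length = " + cleanedText.length());
        // Shorter lines avoid depending on how the console handles long ones.
        System.out.print(wrap(cleanedText, 80));
    }
}
```

Printing the length first is the cheaper of the two checks: a non-zero value already shows that the filtering worked and that the display is the issue.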