I have a large text file with phrases such as:
citybred JJ
Brestowe NNP
STARS NNP NNS
negative JJ NN
investors NNS NNPS
mountain NN
My objective is to keep only the first word of each line, with no surrounding whitespace, and convert it to lowercase. For example:
citybred
brestowe
stars
negative
investors
mountain
This is what should be returned for the text above.
Any help?
Current code:
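import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Scanner;

public class FileLinkList
{
    public static void main(String args[]) throws IOException {
        String content = new String();
        int count = 1; // node counter used when printing
        File file = new File("abc.txt");
        LinkedList<String> list = new LinkedList<String>();
        try {
            // Read the file line by line and store each full line
            Scanner sc = new Scanner(new FileInputStream(file));
            while (sc.hasNextLine()) {
                content = sc.nextLine();
                list.add(content);
            }
            sc.close();
        } catch (FileNotFoundException fnf) {
            fnf.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("\nProgram terminated Safely...");
        }
        Collections.reverse(list);
        Iterator<String> i = list.iterator();
        while (i.hasNext()) {
            System.out.print("Node " + (count++) + " : ");
            System.out.println(i.next());
        }
    }
}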
If each token and its POS tag are separated by whitespace:
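import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedList;

public class FileLinkList {
    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<String>();
        String word;
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader("LEXICON.txt"))) {
            String sCurrentLine;
            while ((sCurrentLine = br.readLine()) != null) {
                String trimmed = sCurrentLine.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                // Split on any run of whitespace and keep only the first token
                word = trimmed.split("\\s+")[0];
                list.add(word.toLowerCase());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Print the extracted words, one per line
        for (String w : list) {
            System.out.println(w);
        }
    }
}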
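To check the extraction step without touching a file, here is a minimal sketch that applies the same trim, split, and lowercase logic to lines held in memory. The class name FirstWords and the helper firstWordsLowercase are hypothetical names chosen for this illustration, and Locale.ROOT is an assumption to keep lowercasing independent of the machine's default locale.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FirstWords {
    // Hypothetical helper: returns the lowercased first token of each non-blank line.
    static List<String> firstWordsLowercase(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // nothing to extract from a blank line
            }
            // "\\s+" matches any run of spaces or tabs, so the first
            // element of the split is always the word before the tags
            String first = trimmed.split("\\s+")[0];
            words.add(first.toLowerCase(Locale.ROOT));
        }
        return words;
    }

    public static void main(String[] args) {
        // Sample lines taken from the question
        List<String> sample = Arrays.asList(
                "citybred JJ",
                "Brestowe NNP",
                "STARS NNP NNS",
                "negative JJ NN",
                "investors NNS NNPS",
                "mountain NN");
        for (String w : firstWordsLowercase(sample)) {
            System.out.println(w);
        }
    }
}
```

Running main on the sample lines prints citybred, brestowe, stars, negative, investors, and mountain, one per line. To use it on the real file, the lines could be collected from the BufferedReader loop above and passed to the helper.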