I need to read a large number of text files for my project. Each file contains the tweets and retweets of one person. I wrote some simple Java code to do that, and I also tried reading the files with C code, which shows the same problem. The program reads some lines properly, but in some cases it breaks a line and reads one single line as two different lines. In some places the program also inserts new lines.
I need to read the files exactly as they are. Could you kindly let me know whether this is due to the contents of the files or to some other reason? Is there any solution? Thanks.
Below is my code, which is very simple.
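import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;

public class Check {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        File InfileName = new File("c:/users/syeda/desktop/12.txt");
        Scanner in = new Scanner(new FileReader(InfileName));
        String line = "";
        int lineNo = 0;
        // Print every line and count how many were read
        while (in.hasNext()) {
            line = in.nextLine();
            System.out.println(line);
            lineNo++;
        }
        System.out.println(lineNo);
    }
}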
My input file contains only 800 lines, but the output shows 819 lines. The extra 19 lines are partly blank lines that are not in the input file and partly lines from the input file that have been broken in two.
Your file has multiple line separators in a row. That is where the blank lines are coming from. \n\n will count as an empty line; on Windows it is probably \r\n\r\n. End-of-line markers are invisible in things like TextPad. You have \n or \r\n where you do not think they are; it is that simple.
Code is correct, data is wrong.
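One way to check the data is to count the line terminators directly, since an editor will not show them. The sketch below is only an illustration: the class name SeparatorCount and the in-memory sample string are made up for the example, and the real file would be passed in as a Reader instead. Besides \n, \r and \r\n, it also counts U+0085, U+2028 and U+2029, because Scanner.nextLine() treats those as line breaks too. Whether the tweet files actually contain any of them is an assumption that this count would confirm or rule out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: counts each kind of line terminator in a character stream.
class SeparatorCount {
    static Map<String, Integer> count(Reader reader) throws IOException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String key : new String[] {"CRLF", "LF", "CR", "NEL", "LS", "PS"}) {
            counts.put(key, 0);
        }
        int prev = -1;
        int c;
        while ((c = reader.read()) != -1) {
            switch (c) {
                case '\n':
                    if (prev == '\r') {
                        // The previous CR was the first half of a CRLF pair: reclassify it
                        counts.merge("CR", -1, Integer::sum);
                        counts.merge("CRLF", 1, Integer::sum);
                    } else {
                        counts.merge("LF", 1, Integer::sum);
                    }
                    break;
                case '\r':
                    counts.merge("CR", 1, Integer::sum);
                    break;
                case 0x0085: // NEXT LINE
                    counts.merge("NEL", 1, Integer::sum);
                    break;
                case 0x2028: // LINE SEPARATOR
                    counts.merge("LS", 1, Integer::sum);
                    break;
                case 0x2029: // PARAGRAPH SEPARATOR
                    counts.merge("PS", 1, Integer::sum);
                    break;
                default:
                    break;
            }
            prev = c;
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample standing in for the real tweet file (assumption)
        String sample = "one\r\ntwo\n\nthree\u2028four";
        System.out.println(count(new StringReader(sample)));
    }
}
```

If the counts for NEL, LS or PS are non-zero, that would explain lines being split in two by Scanner; two LF or CRLF terminators in a row would explain the blank lines.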
Also, Scanner is the wrong choice; BufferedReader would be a better solution.
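A minimal sketch of the BufferedReader approach, under some assumptions: the class name BufferedLineCount and the sample string are invented for the example, and the real file would be opened with something like new FileReader(...) in place of the StringReader. BufferedReader.readLine() ends a line only at \n, \r or \r\n, so a U+2028 inside a tweet stays part of its line here, whereas Scanner.nextLine() would split on it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical example class: prints and counts lines using BufferedReader.
class BufferedLineCount {
    static int countLines(Reader source) throws IOException {
        int lineNo = 0;
        // try-with-resources closes the reader when done
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            // readLine() returns null at end of input
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                lineNo++;
            }
        }
        return lineNo;
    }

    public static void main(String[] args) throws IOException {
        // Sample input (assumption): the second line contains U+2028
        String sample = "first\r\nsecond\u2028still second\n";
        System.out.println(countLines(new StringReader(sample)));
    }
}
```

Consecutive \n\n or \r\n\r\n sequences are still reported as empty lines, so blank lines that are really in the data will continue to be counted.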