Tags: java, parsing, io, java.util.scanner, bufferedreader

Is there any parsing when reading a 'clean' text file using Scanner?


I know that:

Parsing is the process of turning some kind of data into another kind of data.

But then I also came across this difference between Scanner and BufferedReader:

BufferedReader is faster than Scanner because BufferedReader does not need to parse the data.

So my question is: how is Scanner slower than BufferedReader if I am just reading a text file (plain characters) and not doing any parsing? Is there some parsing going on that I am not aware of?

Or, in terms of the following code, how is Scanner slower here than BufferedReader because of parsing?

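import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Scanner;

// 1: BufferedReader returns the text up to the first line terminator
BufferedReader bufferedReader = new BufferedReader(new FileReader("xanadu.txt"));
System.out.println(bufferedReader.readLine());

// 2: Scanner returns the token before the first "\n" delimiter
Scanner scanner = new Scanner(new FileReader("xanadu.txt"));
scanner.useDelimiter("\n");
System.out.println(scanner.next());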

I don't quite understand how Scanner is slower because of parsing when I am technically not parsing any data.


Solution

  • Dividing an input stream into lines is a (very limited) form of parsing, but, as you say, BufferedReader can also do that. The difference, if there is one, will be that BufferedReader can use a highly optimised procedure to implement a single use case (divide a stream into lines), while Scanner needs to be considerably more flexible (divide a stream into tokens delimited by an arbitrary string or regular expression). Flexibility almost always comes at a price, although you won't know what that cost is without doing some benchmarking. (And it may be very small, since it is conceivable that Scanner has optimised algorithms for particular special cases which it can recognise.)

    In short, "because parsing" is not a very good explanation for why one interface is slower than another one. But the more flexibly and precisely you parse an input, the more time it is expected to take.
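If you want to see what the cost actually is on your machine, a rough comparison along the following lines may help. This is only a sketch under stated assumptions: the class name `LineReadingSketch`, the helper `sampleText` and the line count are made up for illustration, and the input is an in-memory string read through a `StringReader` rather than `xanadu.txt`, so that file I/O does not dominate the measurement. It uses a single un-warmed run, so treat the numbers as indicative at best.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

// Hypothetical helper class, not part of the original question or answer.
class LineReadingSketch {

    // Counts lines with BufferedReader.readLine(), which only looks for line terminators.
    public static int countWithBufferedReader(String text) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Counts tokens with Scanner; the delimiter string is treated as a regular expression.
    public static int countWithScanner(String text) {
        int count = 0;
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            scanner.useDelimiter("\n");
            while (scanner.hasNext()) {
                scanner.next();
                count++;
            }
        }
        return count;
    }

    // Builds made-up sample input: non-empty lines, each ending in '\n'.
    public static String sampleText(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("line ").append(i).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = sampleText(200_000); // arbitrary size, chosen for illustration

        long t0 = System.nanoTime();
        int viaReader = countWithBufferedReader(text);
        long t1 = System.nanoTime();
        int viaScanner = countWithScanner(text);
        long t2 = System.nanoTime();

        // Crude wall-clock timings; no JIT warm-up, no repetition.
        System.out.printf("BufferedReader: %d lines in %d ms%n", viaReader, (t1 - t0) / 1_000_000);
        System.out.printf("Scanner:        %d lines in %d ms%n", viaScanner, (t2 - t1) / 1_000_000);
    }
}
```

For anything beyond a quick sanity check, a proper benchmarking harness such as JMH would give more trustworthy numbers, since it handles warm-up and repeated runs for you.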