Java Scanner hasNext() skips empty lines


The log file I want to work with is tab-separated and looks like this:

2019-06-06 10:01:02     1.0
2019-06-06 10:25:12 100.0
2019-06-06 11:02:32     2.0

I use the following code to scan through the file:

import java.util.*;
import java.io.*;

public class FirstTry {
    public static void main(String[] args) 
    {
        String fileName = "LogFile.csv";
        File file = new File(fileName);
        try
        {
            Scanner inputStream = new Scanner(file);
            while (inputStream.hasNext()){
                String data = inputStream.nextLine();
                String[] values = data.split("\\t");
                System.out.println(values[0] + "    " + values[1]);
            }
            inputStream.close();
        }
        catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

The line

System.out.println(values[0] + "    " + values[1]);

prints the following, working output:

2019-06-06 10:01:02    
2019-06-06 10:25:12    100.0
2019-06-06 11:02:32    

but

System.out.println(values[0] + "    " + values[2]);

prints:

2019-06-06 10:01:02    1.0
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException

Why is this exception raised for values[2] and not for values[1]?

Edit: In Sublime Text, with the tabs marked, the log file shows 5 tabs in total (screenshot omitted).

Edit 2:

String[] values = data.split("\\t+");
System.out.println(values[0] + "    " + values[1]);

prints:

2019-06-06 10:01:02    1.0
2019-06-06 10:25:12    100.0
2019-06-06 11:02:32    2.0

but

System.out.println(values[0] + " " + values[2]);

results in a java.lang.ArrayIndexOutOfBoundsException.


Solution

  • Result of String[] values = data.split("\\t");

    1: ["2019-06-06 10:01:02", "", "1.0"]
    2: ["2019-06-06 10:25:12", "100.0"]
    3: ["2019-06-06 11:02:32 ", "", "2.0"]

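    Note that two consecutive tabs produce an empty String between them. Line 2 contains only a single tab, so values has just two elements and values[2] throws an ArrayIndexOutOfBoundsException. values[1] never fails because every line contains at least one tab, but on lines 1 and 3 it is the empty String.

    As mentioned by @Thilo, splitting on "\\t+" should fix your problem: a run of tabs is then treated as a single delimiter, so the number always ends up in values[1] and values[2] no longer exists.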
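
The difference between the two patterns can be checked in isolation. The following is a minimal sketch, not the asker's program: the class name TabSplitDemo, the helper methods splitOnEachTab and splitOnTabRuns, and the two hard-coded strings are all hypothetical stand-ins for lines of the log file. It also assumes that lines with fewer than two fields should simply be skipped.

```java
import java.util.Arrays;

class TabSplitDemo {

    // Splits on every single tab; two consecutive tabs yield an empty String.
    static String[] splitOnEachTab(String line) {
        return line.split("\\t");
    }

    // Splits on runs of one or more tabs, so no empty field appears between columns.
    static String[] splitOnTabRuns(String line) {
        return line.split("\\t+");
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins for a two-tab line and a one-tab line of the log file
        String twoTabs = "2019-06-06 10:01:02\t\t1.0";
        String oneTab = "2019-06-06 10:25:12\t100.0";

        System.out.println(Arrays.toString(splitOnEachTab(twoTabs))); // [2019-06-06 10:01:02, , 1.0]
        System.out.println(Arrays.toString(splitOnEachTab(oneTab)));  // [2019-06-06 10:25:12, 100.0]

        for (String line : new String[] {twoTabs, oneTab}) {
            String[] values = splitOnTabRuns(line);
            if (values.length < 2) {
                continue; // assumption: skip blank or malformed lines instead of failing
            }
            System.out.println(values[0] + "    " + values[1]);
        }
    }
}
```

Checking values.length before indexing is a design choice rather than a requirement; it keeps a stray blank line in the file from raising the same exception again.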