
How to read data from a txt file, remove the header, and save every 2 lines into an array


I want to read the data from a text file, remove the header, and save the data into an array two lines at a time, because each record continues on the following line.

visitor.txt

    1                   DAILY REPORT VISITOR
                            DATE : 02-02-22
    0+------------------------------------------------------------------+
        NO.     DATE            NAME                ADDRESS
        PHONE           BIRTHDAY        NEED                              
     +------------------------------------------------------------------+
          1     02-02-22        ELIZABETH ZEE       WASHINGTON DC
        +32 62          18-10-1985      BORROW BOOK
          2     02-02-22        VICTORIA GEA        BRUSEELS
        +32 64          24-05-1986      VISITOR
          3     02-02-22        GEORGE PHILIPS      BRUSEELS
        +32 76          02-05-1990      VISITOR 

I want the data saved into an array like this:

1       02-02-22        ELIZABETH ZEE       WASHINGTON DC       +32 62          18-10-1985      BORROW BOOK
2       02-02-22        VICTORIA GEA        BRUSEELS            +32 64          24-05-1986      VISITOR
3       02-02-22        GEORGE PHILIPS      BRUSEELS            +32 76          02-05-1990      VISITOR

This is the code

BufferedReader bR = new BufferedReader(new FileReader(myfile));
int i =0;

String line;
        
try {
    while (line = bufferedReader.readLine()) != null) {
    i++;
    String data = line.split("\\s", "")

    if(data.matches("[0-9]{1,3}\\s.+")) {
        String[] dataArray = data.split("\\s", -1);
        String[] result = new String[30];

        System.arraycopy(fileArray, 0, result, 0, fileArray.length);

        String data1 = line.get(i).split("\\s", "")
        String[] fileArray1 = data.split("\\s", -1);
        String[] result1 = new String[30];
 
        System.arraycopy(fileArray1, 0, result1,0,fileArray1.length);       
    }
    
}

The problem is that I think this code is not efficient, because it reads the second line twice, once for data and once for data1. I want every 2 lines to be saved as one row in the database, like the result shown above. Do you have any solution?


Solution

    • The result actually has a dynamic number of records, so a fixed-size array is no longer suitable. Use List<String[]> instead: list.add(stringArray), list.get(i), list.size(), list.isEmpty().
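    • The header seems to consist of 2 lines, but I may be wrong: the sample above shows six lines before the first record, so adjust the number of skipped lines to the real file.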
    • I saw fields containing a space, so one cannot split on \s+ (one or more whitespace characters). I split on \s\s+ instead. It may be better to use the fixed-length field boundaries with line1.substring(i1, i2).
    • FileReader uses the default encoding of the current computer, which makes reading the file non-portable. I have made the charset explicit. If it is always a US-ASCII file, without special characters, you could use StandardCharsets.US_ASCII. Then you can also run the software on a Linux server, which normally uses UTF-8.
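To illustrate the point about splitting, here is a minimal sketch. The class name SplitDemo and the helper splitFields are made up for this example, and the sample line is copied from the question without the block indentation. It shows that \s+ tears apart names such as "ELIZABETH ZEE", while two or more whitespace characters as separator keeps them whole.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SplitDemo {
    /** Splits a report line on runs of two or more whitespace characters. */
    static String[] splitFields(String line) {
        // strip() first, otherwise the leading indentation yields an empty first field.
        return line.strip().split("\\s{2,}");
    }

    public static void main(String[] args) {
        String line = "      1     02-02-22        ELIZABETH ZEE       WASHINGTON DC";

        // One-or-more whitespace: the name and the address are each cut in two.
        System.out.println(Arrays.toString(line.strip().split("\\s+")));

        // Two-or-more whitespace: fields with a single inner space survive.
        System.out.println(Arrays.toString(splitFields(line)));

        // A List grows with the number of records, unlike a fixed-size array.
        List<String[]> rows = new ArrayList<>();
        rows.add(splitFields(line));
        System.out.println(rows.size() + " row(s), name: " + rows.get(0)[2]);
    }
}
```

Note that this still breaks if a field is empty or contains two consecutive spaces, which is why fixed positions with substring may be the safer choice.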

    So, without checking the data format (although such a check would make sense):

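    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.Charset;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    private void stackOverflow() throws IOException {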
        List<String[]> data = loadData("mydata.txt");
        System.out.println(data.size() + " records read");
        for (String[] fields: data) {
            System.out.println(Arrays.toString(fields));
        }
    }
    
    private List<String[]> loadData(String myFile) throws IOException {
        List<String[]> data = new ArrayList<>();
        Path path = Paths.get(myFile);
        try (BufferedReader bufferedReader =
                Files.newBufferedReader(path, Charset.defaultCharset())) {
            if (bufferedReader.readLine() != null
                    && bufferedReader.readLine() != null) { // Skip both header lines.
                String line1, line2;
                while ((line1 = bufferedReader.readLine()) != null
                        && (line2 = bufferedReader.readLine()) != null) {
    
                    // Strip the indentation first, then split on at least 2 spaces.
                    String[] fields1 = line1.strip().split("\\s\\s+", 4);
                    if (fields1.length != 4) {
                        throw new IOException("Wrong number of fields for first line: " + line1);
                    }
                    // Strip the indentation first, then split on at least 2 spaces.
                    String[] fields2 = line2.strip().split("\\s\\s+", 3);
                    if (fields2.length != 3) {
                        throw new IOException("Wrong number of fields for second line: " + line2);
                    }
                    // Combine both halves into one 7-field record.
                    String[] total = Arrays.copyOf(fields1, 7);
                    System.arraycopy(fields2, 0, total, 4, fields2.length);
                    data.add(total);
                }
                if (line1 != null && !line1.isBlank()) {
                    throw new IOException("Trailing single line: " + line1);
                }
            }
        }
        return data;
    }
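    If you do want to check the data format, one possible sketch is to recognise the first line of a record by its shape instead of counting header lines. This assumes that every record starts with a 1 to 3 digit number followed by a date like 02-02-22, as in the sample; the class name VisitorReportParser, the method parse and the pattern FIRST_LINE are made up for this example. Requiring the date keeps the title line, which also starts with a digit, from being mistaken for data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class VisitorReportParser {
    // Assumed shape of a record's first line: number, date dd-dd-dd, then more text.
    private static final Pattern FIRST_LINE =
            Pattern.compile("\\s*\\d{1,3}\\s{2,}\\d{2}-\\d{2}-\\d{2}\\s{2,}\\S.*");

    /** Returns one 7-field row per record; lines that are not data are skipped. */
    static List<String[]> parse(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line1 = lines.get(i);
            if (!FIRST_LINE.matcher(line1).matches()) {
                continue; // title, column header, separator or blank line
            }
            if (i + 1 >= lines.size()) {
                throw new IllegalArgumentException("Missing second line for: " + line1);
            }
            String line2 = lines.get(++i); // consume the continuation line exactly once
            String[] first = line1.strip().split("\\s{2,}", 4);
            String[] second = line2.strip().split("\\s{2,}", 3);
            if (first.length != 4 || second.length != 3) {
                throw new IllegalArgumentException("Unexpected field count in: " + line1 + " / " + line2);
            }
            String[] row = Arrays.copyOf(first, 7);
            System.arraycopy(second, 0, row, 4, second.length);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "1                   DAILY REPORT VISITOR",
                "                        DATE : 02-02-22",
                "    NO.     DATE            NAME                ADDRESS",
                "      1     02-02-22        ELIZABETH ZEE       WASHINGTON DC",
                "    +32 62          18-10-1985      BORROW BOOK");
        for (String[] row : parse(lines)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

    With a real file, the lines could come from Files.readAllLines(path, charset), and each continuation line is read only once.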
    
    • Substring is better and safer than split, because it still works when a field is empty or contains several consecutive spaces.

    • Instead of String[] you might use a record class (a preview feature in Java 14, standard since Java 16):

      record Visitor(String no, String date, String name, String address,
              String phone, String birthday, String need) { }
      List<Visitor> data = new ArrayList<>();
      data.add(new Visitor(fields1[0], fields1[1], fields1[2], fields1[3],
          fields2[0], fields2[1], fields2[2]));
      

    A record needs little code; however, it cannot be changed, only replaced in the list.
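
    Both suggestions, substring and record, could be combined roughly as follows. This is only a sketch: the column offsets (12, 28, 48 for the first line and 20, 36 for the second) are counted from the sample in the question without the block indentation and must be verified against the real file. The class name FixedWidthVisitors and the helper field are made up for this example, and records require Java 16 or later.

```java
final class FixedWidthVisitors {
    record Visitor(String no, String date, String name, String address,
                   String phone, String birthday, String need) { }

    /** Cuts line[from, to), clamped to the line length, and trims the padding. */
    static String field(String line, int from, int to) {
        int start = Math.min(from, line.length());
        int end = Math.min(to, line.length());
        return line.substring(start, end).strip();
    }

    /** Builds one Visitor from the two physical lines of a record. */
    static Visitor parse(String line1, String line2) {
        // Assumed offsets; Integer.MAX_VALUE means "up to the end of the line".
        return new Visitor(
                field(line1, 0, 12),
                field(line1, 12, 28),
                field(line1, 28, 48),
                field(line1, 48, Integer.MAX_VALUE),
                field(line2, 0, 20),
                field(line2, 20, 36),
                field(line2, 36, Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        Visitor visitor = parse(
                "      1     02-02-22        ELIZABETH ZEE       WASHINGTON DC",
                "    +32 62          18-10-1985      BORROW BOOK");
        System.out.println(visitor);
    }
}
```

    Because field clamps the offsets to the line length, a short line gives empty strings for the missing fields instead of a StringIndexOutOfBoundsException.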