
Parsing a file and replacing whitespace found within double quotes using Java


I am reading a file and trying to modify each line in the following order:

  1. If the line is blank, trim() it.
  2. If the line ends with \, strip that character and append the next line to it.
  3. If the complete line contains double quotes with whitespace between them, replace that whitespace with ~. For example, "This is text within double quotes" becomes "This~is~text~within~double~quotes".

This code works, but it is buggy. The problem shows up when a line that ends with \ is followed by lines that don't.

For example, given:
line 1 and \
line 2
line 3

instead of getting

line 1 and line 2
line 3

I get this:

line 1 and line 2     line 3

Updated code:

public List<String> OpenFile() throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            String line;
            //StringBuilder concatenatedLine = new StringBuilder();
            List<String> formattedStrings = new ArrayList<>();
            //Pattern matcher = Pattern.compile("\"([^\"]+)\"");
        while ((line = br.readLine()) != null) {
            boolean addToPreviousLine;
            if (line.isEmpty()) {
                line.trim();

            }

            if (line.contains("\"")) {
                Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
                while (matcher.find()) {
                    String match = matcher.group();
                    line = line.replace(match, match.replaceAll("\\s+", "~"));
                }

            }

            if (line.endsWith("\\")) {
                addToPreviousLine = false;
                line = line.substring(0, line.length() - 1);
                formattedStrings.add(line);
            } else {
                addToPreviousLine = true;

            }

            if (addToPreviousLine) {
                int previousLineIndex = formattedStrings.size() - 1;
                if (previousLineIndex > -1) {
                    // Combine the previous line and current line
                    String previousLine = formattedStrings.remove(previousLineIndex);
                    line = previousLine + " " + line;
                    formattedStrings.add(line);
                }
            }
            testScan(formattedStrings);
            //concatenatedLine.setLength(0);
        }
        return formattedStrings;
    }
}

Solution

  • Update

    I'm giving you what you need, without trying to write all the code for you. You just need to figure out where to place these snippets.

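    If the line is empty, trim(). Note that trim() returns a new String, so calling line.trim() without using the result does nothing.

    // trim().isEmpty() is true for both an empty line and a whitespace-only line
    if (line.trim().isEmpty()) {
        line = "";
        // I don't think you want to add an empty line to your return result.  If you do, just omit the continue;
        continue;
    }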
    

    If the line contains double quotes with whitespace between them, replace that whitespace with ~. For example, "This is text within double quotes" becomes "This~is~text~within~double~quotes".

    Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
    while (matcher.find()) {
        String match = matcher.group();
        line = line.replace(match, match.replaceAll("\\s+", "~"));
    }
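
As a possible alternative to the find/replace loop, here is a minimal sketch that does the substitution in one pass. It assumes Java 9 or later, where `Matcher.replaceAll` accepts a function. The class name `QuotedSpaces` and the method name `tildeInsideQuotes` are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedSpaces {
    // A double-quoted section with no quote characters inside it.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]+\"");

    // Replaces every run of whitespace inside a quoted section with a single ~.
    // Text outside the quotes is left untouched.
    static String tildeInsideQuotes(String line) {
        return QUOTED.matcher(line).replaceAll(
                // quoteReplacement keeps any $ or \ in the match from being
                // interpreted as a group reference or escape.
                m -> Matcher.quoteReplacement(m.group().replaceAll("\\s+", "~")));
    }
}
```

Compiling the pattern once as a constant also avoids recompiling it for every line of the file.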
    

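    If the line ends with \, strip that character and add the next line. You need a flag to track when to do this, and it has to be declared outside the while loop so that its value is still available when the next line is read.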

    if (line.endsWith("\\")) {
        addToPreviousLine = true;
        line = line.substring(0, line.length() - 1);
    } else {
        addToPreviousLine = false;
    }
    

    Now, to add the next line to the previous line, you'll need something like this (figure out where to place this snippet):

    if (addToPreviousLine) {
        int previousLineIndex = formattedStrings.size() - 1;
        if (previousLineIndex > -1) {
            // Combine the previous line and current line
            String previousLine = formattedStrings.remove(previousLineIndex);
            line = previousLine + " " + line;
        }
    }
    

    You still do not need a StringBuffer or StringBuilder. Just modify the current line and add it to your formattedStrings list.
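
For reference, here is one way the pieces might fit together. This is a sketch under assumptions, not the only arrangement. It carries the unfinished text in a plain String (called `pending` here) instead of using the addToPreviousLine flag and removing the last list entry. It joins continued lines without inserting an extra space, because the text before the \ already ends with one in the example above. It skips blank lines. The names `LineFormatter`, `formatLines`, `formatFile` and `replaceSpacesInQuotes` are invented for illustration.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class LineFormatter {
    // Same pattern as in the snippets above, compiled once.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    // Replaces each run of whitespace inside a quoted section with ~.
    static String replaceSpacesInQuotes(String line) {
        Matcher matcher = QUOTED.matcher(line); // matches against the original text
        while (matcher.find()) {
            String match = matcher.group();
            line = line.replace(match, match.replaceAll("\\s+", "~"));
        }
        return line;
    }

    // Joins continuation lines, drops blank lines, then fixes quoted sections.
    static List<String> formatLines(List<String> rawLines) {
        List<String> formatted = new ArrayList<>();
        String pending = ""; // text carried over from lines that ended with a backslash
        for (String line : rawLines) {
            if (line.endsWith("\\")) {
                // Not complete yet: strip the backslash and wait for the next line.
                pending += line.substring(0, line.length() - 1);
                continue;
            }
            String complete = pending + line;
            pending = "";
            if (!complete.trim().isEmpty()) {
                // Quotes are handled only once the whole logical line is known.
                formatted.add(replaceSpacesInQuotes(complete));
            }
        }
        if (!pending.trim().isEmpty()) {
            // The input ended with a continuation line; keep what was collected.
            formatted.add(replaceSpacesInQuotes(pending));
        }
        return formatted;
    }

    // Reads the file at the given path and formats its lines.
    static List<String> formatFile(String path) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            return formatLines(br.lines().collect(Collectors.toList()));
        }
    }
}
```

Because the quote replacement runs only on the completed line, a quoted section that is split across a \ continuation is still converted as a whole.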