Tags: java, string, java-8, apache-commons-text

Capital letters after punctuation marks, but with some exceptions


My rules and exceptions are:

  • Use a capital letter at the beginning of a sentence and after a punctuation mark
  • Add a space after a punctuation mark if it is missing
  • After an abbreviation, don't use a capital letter
  • After a "- " (hyphen followed by a space), use a capital letter
  • After a "-" (hyphen with no space), don't capitalize

Here is my code, which tries to apply these rules:

private static char[] PUNCTUATION_MARKS = { '?', '!', ';', '.', '-' };

applyCorrectCase("step 1 - take your car");
applyCorrectCase("reply-to address");
applyCorrectCase("req. start");
applyCorrectCase("alt. del. date [alternate detection date]");
applyCorrectCase("you are.important for us? to continue?here! yes");

String applyCorrectCase(String value) {
    String lowerCaseValue = value.toLowerCase();
    if (value.contains(". ")) {
        lowerCaseValue = value.replace(". ", ".");
    }
    lowerCaseValue = WordUtils.capitalize(lowerCaseValue, PUNCTUATION_MARKS );
    System.out.println(lowerCaseValue.replace(".", ". "));
}

These are my results:

Step 1 - take your car <--- The 't' after the '-' needs to be uppercase
Reply-To address <--- The 't' after the '-' needs to be lowercase
Req. Start <--- The 's' after the '.' needs to be lowercase because "Req." is an abbreviation
Alt. Del. Date [alternate detection date] <--- Both 'd's after the '.' need to be lowercase because "Alt." and "Del." are abbreviations
You are. Important for us? to continue?Here! yes <--- The 't' after the '?' needs to be uppercase, a space needs to be added between '?' and 'H', and the 'y' after the '!' needs to be uppercase

These are my expectations:

Step 1 - Take your car
Reply-to address
Req. start
Alt. del. date [alternate detection date]
You are. Important for us? To continue? Here! Yes

Any idea how to fix my code?

UPDATE

About the lines if (value.contains(". ")) { and System.out.println(lowerCaseValue.replace(".", ". ")); I wrote these when the period was the only punctuation mark I had to check. Now that there are more marks to handle, this approach no longer works.


Solution

  • I would approach this problem in separate steps.

    Reordering the spec, you have three different categories:

    1. Normalizing spaces: add a space after a punctuation mark if it's missing
    2. Capital letters: at the beginning of a sentence, after a punctuation mark, after a hyphen that is followed by a space
    3. No capital letters: after an abbreviation or after a hyphen that isn't followed by a space

    To do this you need to define your abbreviations (because otherwise what would differentiate an abbreviation from the end of a sentence?).

    Then do the following, in order:

    1. Normalizing spaces

    Look for punctuation marks that aren't followed by spaces and add spaces as needed
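
A minimal sketch of this normalization step, assuming the four marks from the question (?, !, ; and .) and a hypothetical helper class named SpaceNormalizer. It only adds a space when the mark is directly followed by a letter, which is a design choice of this sketch so that inputs such as "3.14" are left alone:

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class SpaceNormalizer {
    // One of ? ! ; . that is immediately followed by a letter.
    // The hyphen is deliberately excluded: "reply-to" must stay as it is.
    private static final Pattern MISSING_SPACE = Pattern.compile("([?!;.])(?=\\p{L})");

    static String normalize(String text) {
        // Collapse runs of whitespace first, then insert the missing spaces.
        String collapsed = text.trim().replaceAll("\\s+", " ");
        return MISSING_SPACE.matcher(collapsed).replaceAll("$1 ");
    }
}
```

For example, SpaceNormalizer.normalize("to continue?here! yes") gives "to continue? here! yes".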

    2. Capital letters

    Find all the punctuation marks and all hyphens that are followed by a space. At each of these points, make the first letter after the space uppercase. Also make the first letter of the string (the beginning of the first sentence) uppercase.
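
One possible way to express this step with a single regular expression, assuming the spaces have already been normalized. The class name SentenceCapitalizer is hypothetical, and the sketch capitalizes after every period, so abbreviations still need the next step:

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class SentenceCapitalizer {
    // Either the start of the string, or one of ? ! ; . - followed by one space,
    // and then a lowercase letter (captured separately so it can be uppercased).
    private static final Pattern SENTENCE_START = Pattern.compile("(^|[?!;.-] )(\\p{Ll})");

    static String capitalize(String text) {
        Matcher m = SENTENCE_START.matcher(text);
        StringBuffer out = new StringBuffer(); // appendReplacement needs StringBuffer on Java 8
        while (m.find()) {
            String upper = m.group(2).toUpperCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1) + upper));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A hyphen without a following space, as in "reply-to", does not match the pattern, so the letter after it is left unchanged.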

    3. No capital letters

    Using the abbreviations you defined: wherever an abbreviation is followed by a period and a space, the letter after the space must be lowercase.

    If you find a hyphen that is followed immediately by an uppercase letter, that letter must become lowercase.
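
Both rules of this step can be sketched with a lookbehind, so that only the offending uppercase letter is matched and consecutive abbreviations such as "Alt. Del. Date" are all handled. The class name AbbreviationFixer and the abbreviation list are assumptions taken from the question's examples, and the sketch assumes the abbreviations consist of plain letters only:

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
final class AbbreviationFixer {
    // Assumed list; extend it with whatever abbreviations the data contains.
    private static final String[] ABBREVIATIONS = { "req", "alt", "del" };

    // An uppercase letter preceded either by "<abbreviation>. " (abbreviation
    // matched case-insensitively, on a word boundary) or directly by a hyphen.
    private static final Pattern TO_LOWER = Pattern.compile(
            "(?<=\\b(?i:" + String.join("|", ABBREVIATIONS) + ")\\. |-)\\p{Lu}");

    static String fix(String text) {
        Matcher m = TO_LOWER.matcher(text);
        StringBuffer out = new StringBuffer(); // appendReplacement needs StringBuffer on Java 8
        while (m.find()) {
            m.appendReplacement(out, m.group().toLowerCase(Locale.ROOT));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The word boundary keeps a word like "Model." from being treated as the abbreviation "del", and "- T" is untouched because the character directly before the letter is a space, not a hyphen.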

    EDIT

    I think this should do the trick for the test cases provided. I'm sure it can be fine-tuned but it's enough to get started:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Punctuation marks, regex-escaped where needed. The hyphen is deliberately
    // absent here: no space is ever added after a hyphen.
    private static final String[] PUNCTUATION_MARKS = { "\\?", "!", ";", "\\." };

    // Abbreviations (matched case-insensitively); the word after them stays lowercase.
    private static final String[] ABBREVIATIONS = { "req", "alt", "del" };

    public static void main(String[] args) {
        applyCorrectCase("step 1 - take your car");
        applyCorrectCase("reply-to address");
        applyCorrectCase("req. start");
        applyCorrectCase("alt. del. date [alternate detection date]");
        applyCorrectCase("you are.important for us? to continue?here! yes");
    }

    static String applyCorrectCase(String value) {
        // lowercase everything and collapse runs of whitespace into one space
        // (strings are immutable, so the result of replaceAll must be assigned)
        String lower = value.toLowerCase().replaceAll("\\s+", " ");
        for (String p : PUNCTUATION_MARKS) {
            // add a space after every punctuation mark, then collapse doubled spaces
            lower = lower.replaceAll(p, "$0 ")
                    .replaceAll("\\s+", " ");
        }
        lower = lower.trim();
        if (lower.isEmpty()) {
            return lower; // nothing to capitalize
        }
        char[] chars = lower.toCharArray();
        // the first letter of the string starts the first sentence
        chars[0] = Character.toUpperCase(chars[0]);
        for (int i = 0; i < chars.length - 2; i++) {
            // capitalize the letter after the space that follows a punctuation mark or a hyphen
            if ("?!;.-".indexOf(chars[i]) >= 0 && chars[i + 1] == ' ') {
                chars[i + 2] = Character.toUpperCase(chars[i + 2]);
            }
        }
        // undo the capitalization after each known abbreviation
        String capitalized = new String(chars);
        for (String a : ABBREVIATIONS) {
            // the lookbehind matches only the letter itself, so consecutive abbreviations
            // are all found; \b keeps a word like "model." from matching "del"
            Matcher m = Pattern.compile("(?<=\\b(?i:" + Pattern.quote(a) + ")\\. )[A-Z]")
                    .matcher(capitalized);
            while (m.find()) {
                chars[m.start()] = Character.toLowerCase(chars[m.start()]);
            }
        }
        String result = new String(chars);
        System.out.println(result);
        return result;
    }