java string stringbuilder pmd stringbuffer

Running PMD on code with StringBuilder gives an error about initialized size and appended size


private static String buildSomeString(Map<String, String> data) {
    StringBuilder result = new StringBuilder();
    for (Map.Entry<String, String> field : data.entrySet()) {
        result.append("some literal")
              .append(field.getKey())
              .append("another literal")
              .append(field.getKey())
              .append("and another one")
              .append(field.getValue())
              .append("and the last in this iteration");
    }
    return result.toString();
}

When I run PMD on this, I get the following error:

StringBuffer constructor is initialized with size 16, but has at least 83 characters appended.

The number of characters is probably wrong, because I changed the literals before posting.

Thanks


Solution

  • StringBuilder's constructor can optionally take an int specifying the initial capacity of its internal buffer. If none is given (as in your code), it defaults to 16.

    As you append data to the StringBuilder, it automatically resizes the internal buffer as needed. Resizing means allocating a new, larger array and copying the old data into it. This is a "costly" operation (note the quotes: this is a micro-optimization, and if you are using a bad algorithm such as bubble sort, you have bigger problems).

    Making a more educated guess at the expected size of the string can avoid or minimize such reallocations.

    PMD doesn't know what the contents of the map are, but it knows the resulting string will contain at least 83 chars (provided the map is not empty).

    This can be resolved by giving the constructor a better estimate of the size, such as:

    StringBuilder result = new StringBuilder(83 * data.size()); // 83 or whatever your constant strings account for
    

    This can be refined further if you can better estimate the expected lengths of the map's keys and values. Usually, going slightly over the actual expected output is better: even though it means allocating more memory, it has a better chance of avoiding reallocations completely.
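
As a rough sketch of the refinement described above, the capacity can be derived from the literals themselves plus a guess at the field lengths. The class name PresizedBuilderSketch, the helper estimateCapacity, and the constant ASSUMED_AVG_FIELD_LENGTH are all hypothetical names introduced for illustration, and the value 16 for the average key or value length is only an assumption you would tune for your own data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PresizedBuilderSketch {

    // The four literals from the question, appended once per map entry.
    private static final String FIRST = "some literal";
    private static final String SECOND = "another literal";
    private static final String THIRD = "and another one";
    private static final String LAST = "and the last in this iteration";

    // Characters contributed by the literals alone in each iteration.
    private static final int LITERAL_CHARS_PER_ENTRY =
            FIRST.length() + SECOND.length() + THIRD.length() + LAST.length();

    // Assumption: a guessed average length for one key or one value.
    private static final int ASSUMED_AVG_FIELD_LENGTH = 16;

    // Hypothetical helper: estimates the final length of the built string.
    static int estimateCapacity(Map<String, String> data) {
        // The key is appended twice and the value once: three fields per entry.
        int perEntry = LITERAL_CHARS_PER_ENTRY + 3 * ASSUMED_AVG_FIELD_LENGTH;
        return data.size() * perEntry;
    }

    static String buildSomeString(Map<String, String> data) {
        // Pre-size the buffer so that appends rarely (ideally never) reallocate.
        StringBuilder result = new StringBuilder(estimateCapacity(data));
        for (Map.Entry<String, String> field : data.entrySet()) {
            result.append(FIRST)
                  .append(field.getKey())
                  .append(SECOND)
                  .append(field.getKey())
                  .append(THIRD)
                  .append(field.getValue())
                  .append(LAST);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // With no argument, the initial capacity is 16.
        System.out.println("default capacity: " + new StringBuilder().capacity());

        Map<String, String> data = new LinkedHashMap<>();
        data.put("id", "42");
        data.put("name", "example");

        System.out.println("estimated capacity: " + estimateCapacity(data));
        System.out.println("actual length: " + buildSomeString(data).length());
    }
}
```

If the estimate turns out too low, the code still works; the builder simply falls back to growing its buffer, so the estimate only affects performance, not correctness.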