Java8: Create HashMap with character count of a String


Is there a simpler way to compute the character count of a given string than the code below?

    String word = "AAABBB";
    Map<String, Integer> charCount = new HashMap<>();
    for (String charr : word.split("")) {
        Integer added = charCount.putIfAbsent(charr, 1);
        if (added != null)
            charCount.computeIfPresent(charr, (k, v) -> v + 1);
    }

    System.out.println(charCount);

Solution

  • The simplest way to count the occurrences of each character in a string, with full Unicode support (Java 11+; see note 1 below):

    String word = "AAABBB";
    Map<String, Long> charCount = word.codePoints().mapToObj(Character::toString)
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    System.out.println(charCount);
    

    1) A Java 8 version with full Unicode support is at the end of this answer.

    Output

    {A=3, B=3}
    

    UPDATE: For Java 8+ (does not support characters from the supplementary planes, e.g. emoji):

    Map<String, Long> charCount = IntStream.range(0, word.length())
            .mapToObj(i -> word.substring(i, i + 1))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
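
To make the limitation concrete, here is a minimal sketch of what happens when the substring approach meets an emoji. The class name SurrogateSplitDemo and the method name countByCodeUnit are made up for this illustration; the stream pipeline is the one shown above. Because substring works on UTF-16 code units, the emoji is counted as two separate surrogate halves rather than as one character.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original answer.
public class SurrogateSplitDemo {

    // Counts UTF-16 code units (Java chars), not Unicode code points.
    public static Map<String, Long> countByCodeUnit(String word) {
        return IntStream.range(0, word.length())
                .mapToObj(i -> word.substring(i, i + 1))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // "A", then U+1F600 (grinning face) written as a surrogate pair, then "A".
        String word = "A\uD83D\uDE00A";
        Map<String, Long> counts = countByCodeUnit(word);

        System.out.println(counts.size());                       // 3: "A" plus two lone surrogates
        System.out.println(counts.get("A"));                     // 2
        System.out.println(counts.containsKey("\uD83D\uDE00"));  // false: the emoji was split
    }
}
```

For strings that contain only characters from the Basic Multilingual Plane, this version and the codePoints() version give the same result.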
    

    UPDATE 2: Also for Java 8+.

    I was mistaken in thinking that codePoints() wasn't added until Java 9. It was added in Java 8 to the CharSequence interface, so it doesn't appear in the Java 8 javadoc for String, and later versions of the javadoc show it as added to String in Java 9.

    However, the Character.toString(int codePoint) method wasn't added until Java 11. In Java 8 we can instead use chars() with the Character.toString(char c) method (again without support for the supplementary planes):

    Map<String, Long> charCount = word.chars().mapToObj(c -> Character.toString((char) c))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    

    Or, for full Unicode support including the supplementary planes, we can use codePoints() with the String(int[] codePoints, int offset, int count) constructor in Java 8:

    Map<String, Long> charCount = word.codePoints()
            .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
            .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
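
For completeness, here is a self-contained sketch of the Java 8 codePoints() variant with the imports it needs. The class name CodePointCount and the method name countByCodePoint are made up for this illustration. The input contains an emoji from a supplementary plane to show that it is counted as a single character.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original answer.
public class CodePointCount {

    // Counts Unicode code points, so a surrogate pair is treated as one character.
    public static Map<String, Long> countByCodePoint(String word) {
        return word.codePoints()
                .mapToObj(cp -> new String(new int[] { cp }, 0, 1))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // "A", U+1F600, "A", U+1F600, "B" (the emoji is written as a surrogate pair).
        String word = "A\uD83D\uDE00A\uD83D\uDE00B";
        Map<String, Long> counts = countByCodePoint(word);

        System.out.println(counts.size());               // 3 distinct characters
        System.out.println(counts.get("A"));             // 2
        System.out.println(counts.get("\uD83D\uDE00"));  // 2: the emoji stays intact
        System.out.println(counts.get("B"));             // 1
    }
}
```

Note that groupingBy returns a HashMap by default, so the iteration order of the result is not guaranteed. If a stable order matters, the three-argument groupingBy overload accepts a map supplier such as TreeMap::new or LinkedHashMap::new.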