
Casting to (char) in Stream.map() Does Not Produce Desired Result


java.util.stream is challenging my notion of chars. I am attempting a simple character rotation of a string, where I add 1 to the int value of each character. I start with "abcd".chars() (an IntStream), and then I map. But I don't seem to be able to cast the ints to char during the map. Why am I not getting the desired result?

This is vanilla JDK17 (temurin).

jshell> "abdc".chars().map(c -> c + 1).collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
$43 ==> 9899101100

Wait, that's not what I want. These are the int values! Let's try a cast to char:

jshell> "abdc".chars().map(c -> (char)c + 1).collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
$44 ==> 9899101100

Ok, that had zero effect. Let's use an explicit map step:

jshell> "abdc".chars().map(c -> c + 1).map(c1 -> (char)c1).collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
$45 ==> 9899101100

Huh? wtf? Ok, let's force the cast during the reduce step:

jshell> "abcd".chars().map(c -> c + 1).collect(()->new StringBuilder(), (c,d)->c.append((char)d), (c1, c2)->c1.append(c2))
$46 ==> bcde

Finally! But WHY? Why doesn't the map to char flow across the stream? What am I missing? I know this is a newbie question, but I would appreciate being educated.


Solution

  • As to why you have the problem, see the Answer by Jon Skeet.

    You can avoid the entire problem easily by using code point integers rather than the broken legacy type char.

    Furthermore, your code fails with most characters. Again, solved by using code points.

    Code point, not char

    The char type has been essentially broken since Java 2, and legacy since Java 5. As a 16-bit value, the char is physically incapable of representing most of the over 149,000 characters defined in Unicode.

    While char is limited to a range of 0 to 65,535 (U+FFFF), code points range from 0 to 1,114,111 (U+10FFFF). Most of that range is not yet assigned, and some is set aside permanently for private use.
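
    To make the size difference concrete, here is a minimal sketch of a character that does not fit in a single char. The class name CodePointVsChar and its helper methods are made up for illustration; the JDK calls used (Character.toChars, String.codePointCount, Character.isSurrogate) are standard.

    ```java
    public class CodePointVsChar {
        // U+1F600 (a grinning-face emoji) lies above U+FFFF, so it cannot fit in one char.
        static final String FACE = new String(Character.toChars(0x1F600));

        // Number of 16-bit char values (UTF-16 code units) in the string.
        static int charCount(String s) {
            return s.length();
        }

        // Number of Unicode code points in the string.
        static int codePointCount(String s) {
            return s.codePointCount(0, s.length());
        }

        public static void main(String[] args) {
            System.out.println(charCount(FACE));                        // 2
            System.out.println(codePointCount(FACE));                   // 1
            System.out.println(Character.isSurrogate(FACE.charAt(0)));  // true
        }
    }
    ```

    Adding 1 to each char of such a string would shift the two surrogate halves separately, which is why per-char arithmetic is only safe for characters at or below U+FFFF.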

    Let’s modify your code to use code points, which removes the casting confusion:

    String s =
        "abdc"
        .codePoints()  // Returns an `IntStream` object. 
        .map( codePoint -> codePoint + 1 )
        .collect( 
            StringBuilder::new , 
            StringBuilder::appendCodePoint , 
            StringBuilder::append 
        )              // Returns a `StringBuilder` object. 
        .toString() ;
    

    See this code run at Ideone.com.

    bced
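
    For comparison, here is a hedged sketch of why the cast in the question seemed to vanish, and one way to keep the chars() approach working. IntStream.map takes an int-to-int function, so a (char) result is widened straight back to int, and StringBuilder::append then resolves to append(int), which writes digits. Switching to mapToObj leaves the IntStream, so the char survives. The class and method names below are illustrative only, and this variant still assumes every character fits in a single char.

    ```java
    import java.util.stream.Collectors;

    public class CastInIntStream {
        // The lambda's (char) result is widened back to int, because
        // IntStream.map expects an IntUnaryOperator and returns an IntStream.
        static String rotateLosingCast(String input) {
            return input.chars()
                    .map(c -> (char) (c + 1))
                    .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
                    .toString();
        }

        // mapToObj produces a Stream<String>, so each shifted char is kept as text.
        static String rotateKeepingChar(String input) {
            return input.chars()
                    .mapToObj(c -> String.valueOf((char) (c + 1)))
                    .collect(Collectors.joining());
        }

        public static void main(String[] args) {
            System.out.println(rotateLosingCast("abdc"));   // 9899101100
            System.out.println(rotateKeepingChar("abdc"));  // bced
        }
    }
    ```

    The code point version above remains the safer choice, since it also handles characters beyond U+FFFF.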