java intellij-idea unicode ide scjp

Which declarations are valid?


Select the three correct answers (valid declarations).

(a) char a = '\u0061';

(b) char 'a' = 'a';

(c) char \u0061 = 'a';

(d) ch\u0061r a = 'a';

(e) ch'a'r a = 'a';

Answer: (a), (c) and (d)

Book:

A Programmer's Guide to Java SCJP Certification (Third Edition)

Can someone please explain why options (c) and (d) are valid? The IDE (IntelliJ IDEA) marks them in red with the message:

Cannot resolve symbol 'u0063'

[Screenshot: the error as shown in IntelliJ IDEA]


Solution

  • The compiler recognises Unicode escapes and translates them to UTF-16 code units before it splits the source into tokens. ch\u0061r therefore becomes char, which is a valid primitive type, so option (d) is correct.

    3.3. Unicode Escapes

    A compiler for the Java programming language ("Java compiler") first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) for the indicated hexadecimal value, and passing all other characters unchanged.
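To make the translation step quoted above concrete, here is a simplified sketch of it. The class name EscapeTranslator and the methods translate and isHex are hypothetical names chosen for illustration, not part of any JDK API. The sketch only handles a backslash, a single u and four hex digits; the real lexer also accepts repeated u characters and applies a backslash-counting rule, both of which are ignored here.

```java
class EscapeTranslator {
    // Simplified sketch: replaces each backslash + 'u' + four hex digits with
    // the UTF-16 code unit it names and passes everything else through.
    // Repeated 'u' characters and backslash parity are deliberately ignored.
    static String translate(String source) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < source.length()) {
            if (source.startsWith("\\u", i)
                    && i + 6 <= source.length()
                    && isHex(source, i + 2)) {
                int codeUnit = Integer.parseInt(source.substring(i + 2, i + 6), 16);
                out.append((char) codeUnit);
                i += 6;
            } else {
                out.append(source.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    // True if the four characters starting at index from are hex digits.
    static boolean isHex(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The escapes are doubled so that this string holds them literally.
        System.out.println(translate("ch\\u0061r \\u0061 = 'a';")); // prints: char a = 'a';
    }
}
```

Because this pass runs before tokenization, the later stages of the compiler never see the escapes at all; they only see the translated characters.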

    Likewise, \u0061 is translated to the letter a, which is a valid Java letter and can therefore form an identifier, so option (c) is correct.
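As a quick check of the three valid options, the following sketch places each declaration in its own method so the variable names do not clash. The class name UnicodeEscapeDemo and the method names are hypothetical and chosen only for this illustration. It should compile with javac even if an IDE highlights the escaped forms.

```java
class UnicodeEscapeDemo {
    // Option (a): the escape appears inside a character literal.
    static char optionA() {
        char a = '\u0061';
        return a;
    }

    // Option (c): the escape spells the identifier, so the variable is named a.
    static char optionC() {
        char \u0061 = 'a';
        return a; // the same variable, written without the escape
    }

    // Option (d): the escape is part of the keyword char.
    static char optionD() {
        ch\u0061r a = 'a';
        return a;
    }

    public static void main(String[] args) {
        boolean allEqual = optionA() == 'a' && optionC() == 'a' && optionD() == 'a';
        System.out.println(allEqual); // prints: true
    }
}
```

Options (b) and (e) are left out because 'a' is a character literal, not a Unicode escape, so it cannot stand in for an identifier or for part of a keyword.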

    3.8. Identifiers

    An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.

    Identifier:
        IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
    IdentifierChars:
        JavaLetter {JavaLetterOrDigit}
    JavaLetter:
        any Unicode character that is a "Java letter"
    JavaLetterOrDigit:
        any Unicode character that is a "Java letter-or-digit"
    

    A "Java letter" is a character for which the method Character.isJavaIdentifierStart(int) returns true.

    A "Java letter-or-digit" is a character for which the method Character.isJavaIdentifierPart(int) returns true.

    The "Java letters" include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII dollar sign ($, or \u0024) and underscore (_, or \u005f). The dollar sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems. The underscore may be used in identifiers formed of two or more characters, but it cannot be used as a one-character identifier due to being a keyword.
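The identifier rules quoted above can be sketched as a small check. The class name IdentifierCheck and the method isIdentifier are hypothetical names for this illustration; Character.isJavaIdentifierStart, Character.isJavaIdentifierPart and javax.lang.model.SourceVersion.isKeyword are standard JDK methods. Treat it as a rough approximation of the grammar, not as the compiler's actual implementation.

```java
class IdentifierCheck {
    // Follows the quoted grammar: a Java letter, then Java letters or digits,
    // and the result must not be a keyword, boolean literal, or null literal.
    static boolean isIdentifier(String s) {
        if (s.isEmpty() || javax.lang.model.SourceVersion.isKeyword(s)) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        int i = Character.charCount(first);
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("a"));    // true: what option (c) declares
        System.out.println(isIdentifier("'a'"));  // false: a quote is not a Java letter, as in option (b)
        System.out.println(isIdentifier("$tmp")); // true: the dollar sign counts as a Java letter
        System.out.println(isIdentifier("1a"));   // false: a digit cannot come first
        System.out.println(isIdentifier("char")); // false: keywords are excluded
    }
}
```

The single underscore case is not exercised here because its status as a keyword depends on the Java version in use.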