I have looked through the suggested "already answered" questions for this. Mostly they want simply to discard such "non-printable" input. I want to use it.
I am getting a UTF-8 `String` returned from keyboard input using

```
BufferedReader br = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
String response = br.readLine();
```

and I am interested in identifying whether the user has input, for example, up-arrow or down-arrow as one of their keystrokes.
Iterating through the `char`s in this `String`, I find that down-arrow translates to (the `int` value for each `char`) 27, 91, 66, i.e. 3 `char`s. The first value corresponds to `Escape`. It seems therefore that this is not a matter of identifying a single `Character` and finding out whether it is non-printable.
Also I'm not clear why this control character can't be printed out as a single UTF-8 character, but instead prints out as its 3 component parts: does this mean that when you iterate through a `String` you are in fact getting its contents byte-by-byte?
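The iteration described above can be sketched like this, with a hard-coded sample string standing in for the keyboard input (the sample string is an assumption for illustration):

```java
// Sketch of the iteration described above. On an ANSI terminal a
// down-arrow arrives as three chars: 27 (ESC), 91 ('['), 66 ('B').
public class CharInspector {
    public static void main(String[] args) {
        String response = "\u001B[B"; // sample: a single down-arrow keystroke
        for (int i = 0; i < response.length(); i++) {
            // the loop yields 16-bit chars (UTF-16 code units), not bytes
            System.out.printf("index %d: %d%n", i, (int) response.charAt(i));
        }
        // prints 27, 91, 66
    }
}
```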
I just wonder if there is any documented or clever way of doing this (finding and identifying control characters) in a given UTF-8 `String`. Perhaps Apache Commons. Or perhaps in Groovy (which I am in fact using, rather than Java)?
You can test for a real control character using the `Character::isISOControl` methods (javadoc).
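For example, applying it to each `char` of the down-arrow sequence from the question:

```java
// Applying Character.isISOControl to each char of a down-arrow sequence:
// only the ESC (27) is classified as an ISO control character.
public class ControlCheck {
    public static void main(String[] args) {
        String input = "\u001B[B"; // ESC [ B
        for (char c : input.toCharArray()) {
            System.out.printf("%d -> isISOControl: %b%n", (int) c, Character.isISOControl(c));
        }
        // 27 -> true, 91 -> false, 66 -> false
    }
}
```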
However, as noted in the comments, up-arrow and down-arrow are keystrokes rather than characters. What they actually produce in the input stream is platform-dependent. For example, if you are using an ANSI-compliant terminal or terminal emulator, an up-arrow will be mapped to the sequence `ESC [ A`. If you simply filter out the ISO control characters, you will remove only the `ESC`.
I don't think there is a reliable platform-independent way to filter out the junk that results from a user mistakenly typing arrow keys. For a platform-specific solution, you need to understand what specific sequences are produced by the user's input device, and then detect and remove those sequences.
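As a sketch of that platform-specific approach for ANSI-style terminals, one could strip CSI sequences (`ESC [` followed by parameter bytes and a final letter) with a regex. The pattern below is an assumption: it covers only the common CSI family (which includes the arrow keys), not every escape sequence a terminal can emit:

```java
import java.util.regex.Pattern;

public class AnsiStripper {
    // Matches ESC [ plus parameter bytes and a final byte (CSI sequences
    // only; other escape-sequence families exist and are not covered).
    private static final Pattern CSI = Pattern.compile("\u001B\\[[0-9;?]*[A-Za-z~]");

    static String stripCsi(String s) {
        return CSI.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String response = "hel\u001B[Blo"; // user hit down-arrow mid-word
        System.out.println(stripCsi(response)); // prints "hello"
    }
}
```

Alternatively, instead of discarding the sequences you could match them and branch on the final letter (`A` for up-arrow, `B` for down-arrow) to act on them, which is what the question is ultimately after.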