I need to convert an arbitrary string to a valid Java identifier.
Is there an existing tool for this task?
With so many Java source refactoring and code generation frameworks around, one would think this is a fairly common task.
This simple method converts any input string into a valid Java identifier:
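import java.nio.charset.StandardCharsets;

public static String getIdentifier(String str) {
    // Start with '_' so the result never begins with a digit
    StringBuilder sb = new StringBuilder("_");
    for (byte b : str.getBytes(StandardCharsets.UTF_8)) {
        // Mask to 0..255: Java bytes are signed, and dropping the minus sign
        // would let "é" (-61, -87) collide with "=W" (61, 87)
        sb.append(b & 0xFF).append('_');
    }
    return sb.toString();
}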
Example:
"%^&*\n()" --> "_37_94_38_42_10_40_41_"
Any input characters whatsoever will work - foreign language chars, linefeeds, anything!
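In addition, this algorithm is deterministic, and it is intended to be unique: two inputs should produce the same identifier only if str1.equals(str2).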
Thanks to Joachim Sauer for the UTF-8 suggestion.
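One way to check that a byte-based encoding like this loses no information is to decode an identifier back into the original string. The following is a minimal sketch under stated assumptions: the class name IdentifierCodec and the methods encode and decode are hypothetical names chosen for illustration, and each byte is written as an unsigned value (b & 0xFF) so that multi-byte UTF-8 characters stay distinguishable from ASCII ones.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class IdentifierCodec {
    // Hypothetical helper: writes each UTF-8 byte as an unsigned decimal, delimited by '_'.
    static String encode(String str) {
        StringBuilder sb = new StringBuilder("_");
        for (byte b : str.getBytes(StandardCharsets.UTF_8)) {
            sb.append(b & 0xFF).append('_');
        }
        return sb.toString();
    }

    // Hypothetical inverse: parses the decimal values back into bytes and decodes them as UTF-8.
    static String decode(String identifier) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (String part : identifier.split("_")) {
            if (!part.isEmpty()) {
                bytes.write(Integer.parseInt(part));
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String id = encode("%^&*\n()");
        System.out.println(id);                                 // prints _37_94_38_42_10_40_41_
        System.out.println(decode(id).equals("%^&*\n()"));      // prints true
    }
}
```

If the round trip succeeds for every well-formed input, two different strings cannot share an identifier. Strings containing unpaired surrogates are an exception, because getBytes substitutes a replacement byte for them.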
If collisions are OK (that is, two different input strings may produce the same result), this code produces more readable output:
public static String getIdentifier(String str) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < str.length(); i++) {
        char c = str.charAt(i);
        if (i == 0 ? Character.isJavaIdentifierStart(c) : Character.isJavaIdentifierPart(c)) {
            sb.append(c);
        } else {
            // An identifier cannot begin with a digit, so prefix a leading code with '_'
            if (i == 0) {
                sb.append('_');
            }
            sb.append((int) c);
        }
    }
    // Note: an empty input yields "", and keywords such as "class" pass through unchanged
    return sb.toString();
}
It preserves characters that are valid in an identifier and converts only the invalid ones to their decimal character codes.
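A character-by-character conversion like this can still return something the compiler rejects, for example an empty string, a result that begins with a digit, or a reserved word such as class. As for an existing tool, the JDK ships javax.lang.model.SourceVersion, whose isName method reports whether a string is a valid name and not a keyword. The sketch below is one possible post-processing step, not a definitive implementation; the class IdentifierCheck and the method ensureValid are hypothetical names.

```java
import javax.lang.model.SourceVersion;

final class IdentifierCheck {
    // Hypothetical helper: prefixes '_' until the candidate is a valid, non-keyword name.
    // Assumes the candidate already consists only of identifier-part characters.
    static String ensureValid(String candidate) {
        for (int i = 0; i < candidate.length(); i++) {
            if (!Character.isJavaIdentifierPart(candidate.charAt(i))) {
                throw new IllegalArgumentException("not an identifier character at index " + i);
            }
        }
        String result = candidate;
        // isName is false for "", for keywords, and for names that start with a digit
        while (!SourceVersion.isName(result)) {
            result = "_" + result;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ensureValid("class"));    // prints _class
        System.out.println(ensureValid("49abc"));    // prints _49abc
        System.out.println(ensureValid("my32var"));  // prints my32var
    }
}
```

Passing the result of getIdentifier through a check like this trades a little readability (an extra leading underscore) for a name that compiles.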