I have a fixed charset, which is the following:
Capitals and lowercase:
A - Z, a-z
Numbers:
0-9
Special characters:
Ñ, É, ñ, à, @, £, $, ¥, è, é, ù, ì, ò, _, !, ", #, %, &, ', (, ), *, +, ,, -, ., /, :, ;, <, =, >, ?, §, `, SPACE, CR, LF, €, [, ], {, |, }, ^, ~, \, ß,Ä,Ö,Ü,ä,ö,ü
I tried using the Guava library,
but my String was reported as not being ASCII-only:
if(!CharMatcher.ascii().matchesAllOf(myString)){
//String doesn't match
}
My input String was:
smsBodyBlock.setBodyContent("A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z, Ä, Ö, Ü,a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z, ä, ö, ü,0, 1, 2, 3, 4, 5, 6, 7, 8, 9,Ñ, É, ñ, à, @, £, $, ¥, è, é, ù, ì, ò, _, !, , #, %, &, ', (, ), *, +, ,, -, ., /, :, ;, <, =, >, ?, §, `, SPACE, CR, LF, €, [, ], {, |, }, ^, ~, , ß\"");
So basically the entire charset I have written above. It didn't match as ASCII.
Is there a fast, reliable and scalable way to check whether a String contains characters other than my predefined ones?
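For context, CharMatcher.ascii() only accepts 7-bit ASCII (code points 0 to 127), and several characters in the charset above, such as Ä, ñ, £ and €, lie outside that range, so the check was bound to fail. The following is a minimal sketch in plain Java for listing the offending characters; the class name NonAsciiFinder and the method nonAscii are hypothetical and not part of Guava.

```java
import java.util.ArrayList;
import java.util.List;

public class NonAsciiFinder {
    // Collects every char of s whose value is above 127, i.e. outside 7-bit ASCII.
    static List<Character> nonAscii(String s) {
        List<Character> result = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 127) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A-umlaut, n-tilde and the euro sign are written as Unicode escapes
        // so the example does not depend on the source file encoding.
        System.out.println(nonAscii("A, Z, \u00C4, \u00F1, \u20AC, ~"));
    }
}
```

For the sample input in main, the returned list holds exactly the three escaped characters; everything else in that string is plain ASCII.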
I believe one of the most efficient ways would be a BitSet - checking whether a character is present takes O(1) time. It is roughly as fast as a lookup in a boolean array, but needs only about one-eighth of the space.
// requires: import java.util.BitSet;
static class MyCustomMatcher {
    // bits needed = highest character in the set + 1 ('€' is U+20AC)
    private static final BitSet matcher = new BitSet('€' + 1);
    static {
        // the backslash must be escaped as \\ inside a Java string literal
        String other = " \r\nÑÉñà@£$¥èéùìò_!\"#%&',()*+-./:;<=>?§`€[]{|}^~\\ßÄÖÜäöü";
        // BitSet.set(from, to) takes an exclusive upper bound, hence the + 1
        matcher.set('A', 'Z' + 1); // upper
        matcher.set('a', 'z' + 1); // lower
        matcher.set('0', '9' + 1); // digit
        for (int i = 0; i < other.length(); i++) matcher.set(other.charAt(i));
    }
    public static boolean matchesAll(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!matcher.get(s.charAt(i))) return false;
        }
        return true;
    }
}
Then you can write
if (MyCustomMatcher.matchesAll("Hello world")) {
// do something
}
I made the class static for simplicity, but you can make it more flexible and reusable by passing the characters to match in a constructor.
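A rough sketch of what that constructor-based variant could look like is shown here. The class name CharsetMatcher and the helper withRange are hypothetical names chosen for illustration, and the allowed characters used in main are only a small sample, not the full charset from the question.

```java
import java.util.BitSet;

public class CharsetMatcher {
    // One bit per allowed char; BitSet grows on demand, so no size is fixed up front.
    private final BitSet allowed = new BitSet();

    // Every character of allowedChars becomes part of the accepted set.
    public CharsetMatcher(String allowedChars) {
        for (int i = 0; i < allowedChars.length(); i++) {
            allowed.set(allowedChars.charAt(i));
        }
    }

    // Adds an inclusive range such as 'A'..'Z'; returns this to allow chaining.
    public CharsetMatcher withRange(char from, char to) {
        allowed.set(from, to + 1); // BitSet.set(from, to) excludes the upper bound
        return this;
    }

    // True only if every char of s is in the accepted set.
    public boolean matchesAll(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!allowed.get(s.charAt(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        CharsetMatcher m = new CharsetMatcher(" \r\n@_!")
                .withRange('A', 'Z').withRange('a', 'z').withRange('0', '9');
        System.out.println(m.matchesAll("Hello world 42!")); // true
        System.out.println(m.matchesAll("Hello\tworld"));    // false, tab is not allowed
    }
}
```

Each instance carries its own BitSet, so different charsets can be validated side by side without touching shared static state.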