For an input String id, I want to perform 4 steps, like below:
And here are the 4 lines of my code:
id = id.replaceAll("[^" + "a-z" + "0-9" + "-" + "_" + "." + "]", "");
id = id.replaceAll(".{2,}",".");
id = id.replaceAll("^.","");
id = id.replaceAll(".$","");
I found that rule 2 returns "." (e.g. he...llo -> .), and that rules 3 and 4 remove characters other than ".".
So I fixed the code like this:
id = id.replaceAll("[^" + "a-z" + "0-9" + "-" + "_" + "." + "]", "");
id = id.replaceAll("\\.{2,}",".");
id = id.replaceAll("\\^.","");
id = id.replaceAll("\\.$","");
And it works fine. I just don't understand why. Does a regular expression need "\" added twice before it can be used? If so, why does rule 1 work just fine? Can someone give me a specific answer? Finally, I wonder whether I can code rule 3 and rule 4 at once, for example by using something like &&?
`.` in a regular expression means "match any single character".
`\.` in a regular expression means "match a single dot/period/full-stop character". A different way to write this would be `[.]`, which has the same end result but is semantically different (I'm not sure if this has a negative impact on the generated code to match the expression).
`[abc.]` in a regular expression means "match a single character that must be 'a' or 'b' or 'c' or '.'" (`[^…]` inverts the meaning: match any character that is not listed). Attention: `-` has special meaning in a character class, so make sure you always put it first or last if you want to match the hyphen character specifically.
As for why the backslash has to be duplicated: Java itself uses the backslash to escape characters in a string. To get a literal backslash as part of the string, you have to escape the backslash itself: `"\\"` is a string containing a single backslash character (`"\"` is a syntax error in Java, because the backslash escapes the following quotation mark, i.e. the string is never terminated).
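The points above can be checked with a small sketch. The class and method names here (`RegexEscapes`, `foldUnescaped`, and so on) are made up for illustration and are not part of the question. It also shows why rule 1 needs no backslash (the dot sits inside a character class), and that the pattern `"\\^."` from the question matches a literal caret followed by any character, whereas `"^\\."` matches a dot at the start of the string.

```java
public class RegexEscapes {
    // Unescaped dot: the regex .{2,} matches any run of two or more characters.
    static String foldUnescaped(String s) {
        return s.replaceAll(".{2,}", ".");
    }

    // Escaped dot: the Java literal "\\.{2,}" is the regex \.{2,} (two or more dots).
    static String foldEscaped(String s) {
        return s.replaceAll("\\.{2,}", ".");
    }

    // Character class: [.] matches only a literal dot, no backslash needed.
    static String foldWithClass(String s) {
        return s.replaceAll("[.]{2,}", ".");
    }

    // Rule 1: inside [...] the dot is literal; the hyphen is placed last.
    static String stripInvalid(String s) {
        return s.replaceAll("[^a-z0-9_.-]", "");
    }

    // Regex \^. : a literal caret followed by any single character.
    static String stripCaretLiteral(String s) {
        return s.replaceAll("\\^.", "");
    }

    // Regex ^\. : a dot at the start of the input.
    static String stripLeadingDot(String s) {
        return s.replaceAll("^\\.", "");
    }

    public static void main(String[] args) {
        System.out.println("\\.".length());               // 2 (backslash + dot)
        System.out.println(foldUnescaped("he...llo"));    // .
        System.out.println(foldEscaped("he...llo"));      // he.llo
        System.out.println(foldWithClass("he...llo"));    // he.llo
        System.out.println(stripInvalid("He.l-l_o!"));    // e.l-l_o
        System.out.println(stripCaretLiteral(".hello"));  // .hello (no caret to match)
        System.out.println(stripLeadingDot(".hello"));    // hello
    }
}
```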
To reduce your logic down to two `replaceAll` calls, I would suggest changing the order of your calls and then joining your expressions as alternatives with the `|` operator:
id = id.replaceAll("\\.+", ".") // fold all dots
       .replaceAll("[^a-z0-9_.-]|^\\.|\\.$", "");
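As a runnable sketch of the two-call version, assuming the dot in the first pattern is meant to be escaped (`"\\.+"`), something like the following can be used to try it out. The class name `IdNormalizer` and the method `normalize` are hypothetical names chosen for this example.

```java
public class IdNormalizer {
    // Two-pass clean-up: fold runs of dots, then drop disallowed
    // characters together with a leading or trailing dot.
    static String normalize(String id) {
        return id.replaceAll("\\.+", ".")                          // fold all dots
                 .replaceAll("[^a-z0-9_.-]|^\\.|\\.$", "");        // alternatives joined with |
    }

    public static void main(String[] args) {
        System.out.println(normalize("..he...llo.."));  // he.llo
        System.out.println(normalize("he...llo"));      // he.llo
        System.out.println(normalize("a.!.b"));         // a..b
    }
}
```

One thing worth checking against your requirements: because invalid characters are removed after the dots are folded, an input such as `a.!.b` ends up as `a..b`, and a dot that only becomes leading or trailing once an invalid character is removed is kept. If that matters, removing invalid characters first and folding dots afterwards may fit better.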