I want to separate a Persian date from the word stuck to it in Java. My string looks like this: "۰۱/۰۷/۱۳۹۵سعید"
I have searched a lot but can't find a solution that works for me. Note that the date format might be completely wrong; what matters is separating the words from the numbers.
I want to end up with something like "۰۱/۰۷/۱۳۹۵ سعید"
Here is my solution. It adds spaces to the String as you requested. In my main method, I give سعید۰۱/۰۷/۱۳۹۵سعید as input and get سعید ۰۱/۰۷/۱۳۹۵ سعید printed on the console.
    public class StringPadder {

        // Zero-width position with a digit before it and a letter after it.
        private static final String BETWEEN_NUMBER_AND_LETTER = "(?<=\\p{IsDigit})(?=\\p{IsAlphabetic})";
        // Zero-width position with a letter before it and a digit after it.
        private static final String BETWEEN_LETTER_AND_NUMBER = "(?<=\\p{IsAlphabetic})(?=\\p{IsDigit})";

        // Inserts a space at every boundary between a digit and a letter.
        public static String addSpaces(String toPad) {
            return toPad.replaceAll(BETWEEN_NUMBER_AND_LETTER, " ").replaceAll(BETWEEN_LETTER_AND_NUMBER, " ");
        }

        public static void main(String[] args) {
            String toTest = "سعید۰۱/۰۷/۱۳۹۵سعید";
            System.out.println(addSpaces(toTest));
        }
    }
This works by using a few regular expression features.
\p{IsDigit} matches a digit in any script; so not just 0-9, but also Arabic/Persian digits, Devanagari digits and so on.
\p{IsAlphabetic} matches a letter in any script; so not just A-Z and a-z, but also the Arabic/Persian alphabet and other alphabets.
(?<=X) means that the match you're looking for must be preceded by something that matches X, but the match for X won't be part of the match that you find. This is called a "lookbehind", because it says "look behind what you're matching, and see if it's X".
(?=X) means that the match you're looking for must be followed by something that matches X, but the match for X won't be part of the match that you find. This is called a "lookahead", because it says "look ahead of what you're matching, and see if it's X".
Putting all that together, I've included two regular expressions in the code, namely BETWEEN_NUMBER_AND_LETTER and BETWEEN_LETTER_AND_NUMBER. Each of these matches nothing at all (an empty, zero-width position), because it consists only of a lookbehind and a lookahead, and neither of those consumes any characters. So BETWEEN_NUMBER_AND_LETTER matches "nothing at all" with a number before it and a letter after it; and BETWEEN_LETTER_AND_NUMBER matches "nothing at all" with a letter before it and a number after it.
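To make the "matches nothing at all" idea concrete, here is a minimal sketch that inspects where the first pattern matches and how many characters it consumes. The class name ZeroWidthDemo and its two helper methods are hypothetical, introduced only for this illustration; the pattern itself is the same one as BETWEEN_NUMBER_AND_LETTER.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the solution itself.
class ZeroWidthDemo {

    // Same expression as BETWEEN_NUMBER_AND_LETTER.
    private static final Pattern DIGIT_THEN_LETTER =
            Pattern.compile("(?<=\\p{IsDigit})(?=\\p{IsAlphabetic})");

    // Index of the first digit-to-letter boundary, or -1 if there is none.
    public static int firstBoundary(String s) {
        Matcher m = DIGIT_THEN_LETTER.matcher(s);
        return m.find() ? m.start() : -1;
    }

    // True if the first match exists and consumes no characters (start == end).
    public static boolean firstMatchIsEmpty(String s) {
        Matcher m = DIGIT_THEN_LETTER.matcher(s);
        return m.find() && m.start() == m.end();
    }

    public static void main(String[] args) {
        System.out.println("۵".matches("\\p{IsDigit}"));                    // true: Persian digit
        System.out.println("س".matches("\\p{IsAlphabetic}"));               // true: Persian letter
        System.out.println("/".matches("\\p{IsDigit}|\\p{IsAlphabetic}"));  // false: neither
        System.out.println(firstBoundary("۱۳۹۵سعید"));                      // 4
        System.out.println(firstMatchIsEmpty("۱۳۹۵سعید"));                  // true
    }
}
```

The match starts and ends at index 4, right between the last digit and the first letter, so replacing it with a space inserts the space without removing anything.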
What you need to do is replace each of those "nothings" with a space, because that will separate any letter from any number, provided they were consecutive characters in the original String. That's what my addSpaces method does: it first puts a space at any point in the String where there was a number immediately followed by a letter, then it puts a space at any point where there was a letter immediately followed by a number.
My test case in main demonstrates that this is what you required.
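Since a position cannot be both kinds of boundary at once, the two passes could also be merged into a single pattern with alternation. The following is only a sketch of that variant: the class name SinglePassPadder and the constant DIGIT_LETTER_BOUNDARY are hypothetical names, and precompiling the Pattern is a design choice for repeated calls rather than something the task requires.

```java
import java.util.regex.Pattern;

// Hypothetical single-pass variant of StringPadder.
class SinglePassPadder {

    // Digit-then-letter boundary OR letter-then-digit boundary.
    private static final Pattern DIGIT_LETTER_BOUNDARY = Pattern.compile(
            "(?<=\\p{IsDigit})(?=\\p{IsAlphabetic})|(?<=\\p{IsAlphabetic})(?=\\p{IsDigit})");

    // Inserts a space at every boundary in one pass over the input.
    public static String addSpaces(String toPad) {
        return DIGIT_LETTER_BOUNDARY.matcher(toPad).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("۰۱/۰۷/۱۳۹۵سعید")); // ۰۱/۰۷/۱۳۹۵ سعید
        System.out.println(addSpaces("۹۹/۹۹سعید۱۲"));    // ۹۹/۹۹ سعید ۱۲
    }
}
```

The second input is not a valid date, but it is still split correctly because the pattern only looks at digit and letter neighbours. One limitation to be aware of: a separator such as / is neither a digit nor a letter, so a string like "۱۳۹۵/سعید" is left unchanged.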