I'm trying to build a finite state machine and I want to check the sequence I get against a regular expression. I need to check whether the sequence has the following form:
For example:
"A,B,C,C,C,C,C,A"
-> is accepted.
"A,B,C,C,C,C,A"
-> is ignored.
"A,B,C,C,C,C,C,C,A"
-> is ignored.
I found this post and that post, but everything I tried simply doesn't work.
I tried the following: A\B\D{5}\A, ABD{5}A and a couple more, but again with no success.
EDIT: I want to know whether the C character is repeated exactly 5 times. What comes before and after doesn't matter at all, so it could also look like this:
A,A,A,F,F,R,E,D,C,C,C,C,C, ......
Don't consider the commas.
The problem is that I need to find out whether a sequence is accepted, and the sequence has the following form: A,B, C*10. I created the machine class, the state class and the event class. But now I need to know whether I have exactly 5 repetitions of C, and it is causing me a lot of problems.
EDIT: It's not working; see the code I've added.
String sequence1 = "A,B,C,C,C,C,A";
String sequence2 = "A,B,C,C,C,C,C,A";
String sequence3 = "A,B,C,C,C,C,C,C,A";
Pattern mPattern = Pattern.compile("(\\w)(?:,\\1){4}");
Matcher m1 = mPattern.matcher(sequence1);
m1.matches(); //FALSE
Matcher m2 = mPattern.matcher(sequence2);
m2.matches(); //FALSE
Matcher m3 = mPattern.matcher(sequence3);
m3.matches(); //FALSE
It always returns false.
How can I achieve this?
Thanks.
Your regex is not working because it doesn't account for the commas in your string, which I assume are present.
You can try the following regex (I'm posting here a generalized pattern, you can modify it accordingly): -
"(\\w)(?:,\\1){4}"
This will match any sequence of 5 identical characters separated by commas.
\1 is a backreference to the 1st captured character, so the remaining 4 characters must be the same as that one.
Explanation: -
"( // 1st capture group
\\w // Start with a character
)
(?: // Non-capturing group
, // Match `,` after `C`
\\1 // Backreference to 1st capture group.
// Match the same character as in (\\w)
){4}" // Group close. Match 4 times
// As 1st one we have already matched in (\\w)
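Note that Matcher.matches() succeeds only when the entire input matches the pattern, so a pattern that describes just the run of five needs Matcher.find() instead. Here is a minimal sketch of the difference, using the three sequences from the question. The class and method names are made up for illustration and are not from the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindVsMatches {
    // Same character repeated 5 times, separated by commas.
    static final Pattern RUN = Pattern.compile("(\\w)(?:,\\1){4}");

    // True if the pattern occurs anywhere in the input.
    static boolean containsRun(String s) {
        return RUN.matcher(s).find();
    }

    // True only if the whole input is the run (this is what matches() checks).
    static boolean isOnlyRun(String s) {
        return RUN.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "A,B,C,C,C,C,A",      // 4 x C
            "A,B,C,C,C,C,C,A",    // 5 x C
            "A,B,C,C,C,C,C,C,A"   // 6 x C
        };
        for (String s : inputs) {
            System.out.println(s + " find=" + containsRun(s)
                    + " matches=" + isOnlyRun(s));
        }
    }
}
```

With find(), the sequence with four Cs is rejected and the one with five is accepted, but the one with six Cs is accepted too, because it contains five in a row. matches() returns false for all three, since none of them consists of the run alone.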
UPDATE: -
If you only want to match a sequence of exactly 5, you can add a negation of the matched character after the 5th match: -
"(\\w)(?:,\\1){4}(?!,\\1)"
(?!,\\1)
-> is a negative look-ahead assertion. It makes the pattern match 5 consecutive identical characters that are not followed by the same character again.
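One caveat when this pattern is used with find(): the look-ahead only constrains what comes after the run, so in a longer run the match can simply start one character later. A small sketch to check this, reporting where the first match begins (the class and method names are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookAheadOnly {
    // Run of 5 identical characters that is not followed by a 6th.
    static final Pattern P = Pattern.compile("(\\w)(?:,\\1){4}(?!,\\1)");

    // Index where the first match starts, or -1 if there is no match.
    static int firstMatchStart(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchStart("A,B,C,C,C,C,C,A"));   // 5 x C
        System.out.println(firstMatchStart("A,B,C,C,C,C,C,C,A")); // 6 x C
    }
}
```

For five Cs the match starts at the first C (index 4). For six Cs there is still a match, starting at the second C (index 6), so the character before the run has to be checked as well.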
UPDATE: -
In the above regex, we would also need a negative look-behind for \\1, which we can't do. So I came up with this weird-looking regex. I don't like it myself, but you can try it and see whether it works: -
Not Tested: -
"(\\w),((?!\\1)\\w)(?:,\\2){4}(?!,\\2)"
Explanation: -
( // First Capture Group
\\w // Any character, before your required sequence. (e.g. `A` in `A,C,C,C,C,C`)
) // Group end
, // comma after `A`
( // Captured group 2
(?!\\1) // Negative look-ahead: the next character must differ from the one in group 1.
// (`^` negates only inside a character class, so `^\\1` would not do this)
\\w // The character that starts the sequence, e.g. `C` after `A`
)
(?: // non-capturing group
, // Match comma
\\2 // match the 2nd capture group character. Which is different from `A`,
// and same as the one in group 2, may be `C`
){4} // Match 4 times
(?! // Negative look-ahead
,
\\2 // for the 2nd captured group, `C`
)
I don't know whether that explanation makes the most sense or not, but you can try it. If it works and you don't understand it, I'll try to explain it a little better.
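For anyone who wants to try the idea, here is a sketch of one way to write it so that it compiles and can be checked against the question's sequences. It is not the only way to do this. It expresses "a character other than group 1" as a negative look-ahead, `(?!\\1)\\w`, and adds a `^` alternative so that a run at the very start of the input is also found. That second part relies on Java treating a backreference to a group that did not take part in the match as a failed match. The class and method names are hypothetical, and the sketch assumes single-character tokens separated by commas.

```java
import java.util.regex.Pattern;

class ExactlyFive {
    // (?:^|(\\w),)   start of input, or a preceding character plus comma
    // ((?!\\1)\\w)   first character of the run, different from group 1
    // (?:,\\2){4}    the same character 4 more times
    // (?!,\\2)       not followed by a 6th occurrence
    static final Pattern P = Pattern.compile(
            "(?:^|(\\w),)((?!\\1)\\w)(?:,\\2){4}(?!,\\2)");

    // True if the input contains a run of exactly 5 identical characters.
    static boolean hasRunOfExactlyFive(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "A,B,C,C,C,C,A",
            "A,B,C,C,C,C,C,A",
            "A,B,C,C,C,C,C,C,A",
            "A,A,A,F,F,R,E,D,C,C,C,C,C"
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + hasRunOfExactlyFive(s));
        }
    }
}
```

With this pattern, only the sequences containing exactly five consecutive Cs should be reported as accepted; the ones with four or six Cs should be rejected.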