I have the following problem: I'm trying to get an element from a webpage using WatiN's Find.ByText, but I can't get it to work with a regex in C#.
This statement will return the desired element.
return this.Document.Element(Find.ByText("781|262"));
When I try to use regex, I get back the whole page.
return this.Document.Element(Find.ByText(new Regex(@"781\|262")));
I am trying to get this element:
<td>781|262</td>
I also tried
return this.Document.Element(Find.ByText(Predicate));
private bool Predicate(string s)
{
    return s.Equals("781|262");
}
The above works, while this does not:
private bool Predicate(string s)
{
    return new Regex(@"781\|262").IsMatch(s);
}
I have now realized that inside the predicate, s is the whole page content. I guess the issue is with Document.Element. Any help is appreciated, thank you.
Well, I did not realize that the regex also matches the body and html elements, since their text contains the pattern as well. I had to anchor the pattern with ^ and $ so that the text must begin and end with it, which makes it match only the desired element:
^781\u007c262$
\u007c matches |; I used it because the MSDN documentation did too. Escaping the character as \| works as well.
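To illustrate why the anchors matter, here is a minimal sketch that uses only System.Text.RegularExpressions and no WatiN at all. The two strings are hypothetical stand-ins for the text of the td cell and the text of an enclosing element such as body; the real page content will differ.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Hypothetical stand-ins (not taken from the real page):
    // the text of the <td> cell and the text of an enclosing element such as <body>.
    public const string CellText = "781|262";
    public const string BodyText = "Village list 781|262 and more page text";

    public static readonly Regex Unanchored = new Regex(@"781\|262");
    public static readonly Regex Anchored = new Regex(@"^781\|262$");

    public static void Main()
    {
        Console.WriteLine(Unanchored.IsMatch(CellText)); // True
        Console.WriteLine(Unanchored.IsMatch(BodyText)); // True: enclosing text contains the pattern too
        Console.WriteLine(Anchored.IsMatch(CellText));   // True
        Console.WriteLine(Anchored.IsMatch(BodyText));   // False: text does not start and end with the pattern
    }
}
```

Since the unanchored pattern matches any text that merely contains 781|262, an outer element can satisfy it before the cell does, which would explain getting the whole page back.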
The final code:
<td>781|262</td>
return Document.TableCell(Find.ByText(new Regex(@"^\d{3}\|\d{3}$")));
Document.TableCell speeds up the search by only trying the regex on td elements.
@ makes the string literal verbatim, so C# does not interpret \ as the start of an escape sequence.
^ only matches elements whose text begins with the pattern that follows
\d{3} match a digit (0-9) 3 times
\| match | literally
\d{3} match digit 0-9 3 times
$ the element's text must also end with this pattern
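The breakdown above can be checked without a browser. The following is a small sketch that runs the same pattern against a few sample strings; the class name, the helper IsCoordinate, and the sample values are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CoordinatePatternDemo
{
    // Same pattern as in the final code above.
    private static readonly Regex Coordinate = new Regex(@"^\d{3}\|\d{3}$");

    // Hypothetical helper: true when the whole text is "three digits | three digits".
    public static bool IsCoordinate(string text)
    {
        return Coordinate.IsMatch(text);
    }

    public static void Main()
    {
        // Only the first sample satisfies every part of the pattern.
        string[] samples = { "781|262", "78|262", "7811|262", "at 781|262", "781-262" };
        foreach (string sample in samples)
        {
            Console.WriteLine("{0,-12} {1}", sample, IsCoordinate(sample));
        }
    }
}
```

Note that this generalized pattern accepts any cell of that shape, not just 781|262, so on a page with several such cells the first match would be returned.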