c#, .net, regex

DotNet equivalent to Java Matcher.hitEnd()


I am trying to set up a series of regular expressions to match unknown incoming data.

What I want to be able to do is tell whether each expression matches, can never match, or might still match the data received so far.

For example, if we have the following regular expressions:

  • \Aaaab
  • \Aaaac
  • \Abbb

If we have received aa so far, the first two might match, depending on what comes next, but the third will never match.

If we receive some more and get aaac then the first won't match, the second does match, and the third still won't match.

In Java this can be done like so:

Pattern pattern = Pattern.compile("\\Aaaab"); // the backslash must be escaped in a Java string literal
Matcher matcher = pattern.matcher(receivedSoFar);
if (matcher.matches())
{
  // A match has been found now
}
else if (matcher.hitEnd())
{
  // Doesn't match yet, but don't write it off, wait for more data.
}
else
{
  // This pattern doesn't match at all, we can rule it out.
}

See here for a working example.

The distinction between "might match" and "doesn't match" is important because if we reach the point where none of the patterns can possibly match then we need to trigger an error state response.
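
To make the three outcomes and the error state concrete, here is a minimal, self-contained Java sketch built on the snippet above. The names PartialMatchDemo, Status, classify and allRuledOut are hypothetical and chosen only for illustration; the only library calls used are Pattern.compile, Matcher.matches() and Matcher.hitEnd().

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PartialMatchDemo {
    // Hypothetical names for the three outcomes described in the question.
    enum Status { MATCH, MIGHT_MATCH, NO_MATCH }

    // Classify one pattern against the data received so far.
    static Status classify(Pattern pattern, CharSequence receivedSoFar) {
        Matcher matcher = pattern.matcher(receivedSoFar);
        if (matcher.matches()) {
            return Status.MATCH;
        }
        // hitEnd() is true when the failed attempt ran into the end of the input,
        // so more input could still change the result.
        return matcher.hitEnd() ? Status.MIGHT_MATCH : Status.NO_MATCH;
    }

    // True when every pattern has been ruled out, i.e. the error state.
    static boolean allRuledOut(List<Pattern> patterns, CharSequence receivedSoFar) {
        for (Pattern p : patterns) {
            if (classify(p, receivedSoFar) != Status.NO_MATCH) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Pattern> patterns = List.of(
            Pattern.compile("\\Aaaab"),
            Pattern.compile("\\Aaaac"),
            Pattern.compile("\\Abbb"));

        // "aax" is an extra input (not from the question) that rules out all three.
        for (String input : List.of("aa", "aaac", "aax")) {
            for (Pattern p : patterns) {
                System.out.println(input + " vs " + p.pattern() + ": " + classify(p, input));
            }
            System.out.println("error state for " + input + ": " + allRuledOut(patterns, input));
        }
    }
}
```

With the inputs from the example, aa should leave the first two patterns as MIGHT_MATCH and rule out the third, while aaac matches only the second. Because matches() requires the whole input to match, data that continues past a complete match (say aaacx) is reported as NO_MATCH in this sketch.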

I have been looking for a way to implement this in C# DotNet and I haven't been able to find one. The System.Text.RegularExpressions.Match class only has a Success property which tells us whether the match has been found but can't tell us whether a future match is possible.

Is there any way to do this in C# DotNet?

I have tried Google, the learn.microsoft.com documentation and even, out of desperation, ChatGPT. None of them came up with any suggestions.

Edited: Sorry, I accidentally referenced requireEnd where I meant hitEnd.


Solution

  • Have a look at https://github.com/ltrzesniewski/pcre-net (Perl Compatible Regular Expressions for .NET)

    There is a section about partial matching in the readme that might cover your use case.

    I don't know of a good way to meet this requirement with the built-in .NET Regex implementation, so an alternative implementation from NuGet might be your best bet.