
NEsper issue with regexp


I have been stuck on this for a good while and have narrowed the problem down to what looks like incorrect NEsper behaviour with regular expressions. I wrote a simple project to reproduce the issue; it is available on GitHub.

In a nutshell, NEsper lets me pump messages (events) through a set of SQL-like rules. If an event matches a rule, NEsper fires an alert. My application needs a rule based on a regular expression, and that does not seem to work.

Problem
I tried both ways of creating statements, CreatePattern and CreateEPL, and neither fires a match event, even though the .NET Regex class confirms that the regular expression matches the input. If I pass a matching literal value ("127.0.0.5") to the statement instead of the regex ("\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"), the event fires successfully.

INPUT 
127.0.0.5

==RULE FAIL==
every (Id123=TestDummy(Value regexp '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b'))
// and I want this to pass

==RULE PASS==
every (Id123=TestDummy(Value regexp '127.0.0.5'))
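
One detail about the two rules can be checked with plain .NET, outside NEsper. The sketch below only uses System.Text.RegularExpressions; the class and method names (RuleRegexCheck, Matches) are made up for illustration. It shows that both rule patterns match the input in .NET, and that the passing pattern is itself a regex whose unescaped dots match any character and which contains no backslashes at all.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the project in the question.
public static class RuleRegexCheck
{
    // The two rule patterns from the question, as verbatim strings.
    public const string IpPattern = @"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b";
    public const string LiteralPattern = "127.0.0.5";

    // Plain .NET search semantics: true if the pattern matches anywhere in the input.
    public static bool Matches(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        const string input = "127.0.0.5";

        Console.WriteLine(Matches(IpPattern, input));            // True
        Console.WriteLine(Matches(LiteralPattern, input));       // True
        // Unescaped dots match any character, so this succeeds as well.
        Console.WriteLine(Matches(LiteralPattern, "127x0y0z5")); // True
        // The passing rule contains no backslashes, the failing one does.
        Console.WriteLine(LiteralPattern.Contains("\\"));        // False
        Console.WriteLine(IpPattern.Contains("\\"));             // True
    }
}
```

So the difference between the failing and the passing rule is not whether the regex is valid in .NET, but what the rule text contains.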

Question
Could anyone help me out with a working sample of NEsper regular expression matching, or point out my mistake in the code?

Code
This is my NEsper demo wrapper class

public class NesperAdapter
{
    public MatchEventSubscrtiber Subscriber { get; set; }
    internal EPServiceProvider Engine { get; private set; }

    public NesperAdapter()
    {
        //This call internally depends on log4net and
        //will throw an error if log4net cannot be loaded
        EPServiceProviderManager.PurgeDefaultProvider();

        //config
        var configuration = new Configuration();
        configuration.AddEventType("TestDummy", typeof(TestDummy).FullName);
        configuration.EngineDefaults.Threading.IsInternalTimerEnabled = false;
        configuration.EngineDefaults.Logging.IsEnableExecutionDebug = false;
        configuration.EngineDefaults.Logging.IsEnableTimerDebug = false;

        //engine
        Engine = EPServiceProviderManager.GetDefaultProvider(configuration);
        Engine.EPRuntime.SendEvent(new TimerControlEvent(TimerControlEvent.ClockTypeEnum.CLOCK_EXTERNAL));
        Engine.Initialize();
        Engine.EPRuntime.UnmatchedEvent += OnUnmatchedEvent;
    }

    public void AddStatementFromRegExp(string regExp)
    {
        const string pattern = "every (Id123=TestDummy(Value regexp '{0}'))";
        string formattedPattern = String.Format(pattern, regExp);
        EPStatement statement = Engine.EPAdministrator.CreatePattern(formattedPattern);

        //this is subscription
        Subscriber = new MatchEventSubscrtiber();
        statement.Subscriber = Subscriber;
    }

    internal void OnUnmatchedEvent(object sender, UnmatchedEventArgs e)
    {
        Console.WriteLine(@"Unmatched event");
        Console.WriteLine(e.Event);
    }

    public void SendEvent(object someEvent)
    {
        Engine.EPRuntime.SendEvent(someEvent);
    }
}

Then the subscriber and the TestDummy event type

public class MatchEventSubscrtiber
{
    public bool HasEventFired { get; set; }

    public MatchEventSubscrtiber()
    {
        HasEventFired = false;
    }

    public void Update(IDictionary<string, object> rows)
    {
        Console.WriteLine("Match event fired");
        Console.WriteLine(rows);

        HasEventFired = true;
    }
}

public class TestDummy
{
    public string Value { get; set; }
}

And the NUnit test. If you comment out the nesper.AddStatementFromRegExp(regexp); line and uncomment the //nesper.AddStatementFromRegExp(input); line, the test passes. However, I need a regular expression there.

//Match any IP address
[TestFixture(@"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b", "127.0.0.5")] 
public class WhenValidRegexpPassedAndRuleCreatedAndPropagated
{
    private NesperAdapter nesper;

    //Setup
    public WhenValidRegexpPassedAndRuleCreatedAndPropagated(string regexp, string input)
    {
        //check it is valid regexp in .NET
        var r = new Regex(regexp);
        var match = r.Match(input);
        Assert.IsTrue(match.Success, "Regexp validation failed in .NET");

        //create and start engine
        nesper = new NesperAdapter();

        //Add a rule, this fails with a correct regexp and a matching input
        //PROBLEM IS HERE 
        nesper.AddStatementFromRegExp(regexp);
        //PROBLEM IS HERE 

        //This works, but it is just input self-matching
        //nesper.AddStatementFromRegExp(input);

        var oneEvent = new TestDummy
        {
            Value = input
        };

        nesper.SendEvent(oneEvent);
    }

    [Test]
    public void ThenNesperFiresMatchEvent()
    {
        //wait until NEsper has processed the event
        Thread.Sleep(100);

        //Check if subscriber has received the event
        Assert.IsTrue(nesper.Subscriber.HasEventFired,
            "Event didn't fire");
    }
}

Solution

  • After debugging this issue for some time, I found that NEsper incorrectly handles statements of the form

    WHERE regexp 'foobar'

    So if I have

    SELECT * FROM MyType WHERE PropertyA regexp 'some valid regexp'

    NEsper reformats 'some valid regexp' before using it: a string literal that contains a backslash is passed through Unescape, and the pattern is wrapped in ^ and $. This removes important (and valid) symbols from the regexp. Below is how I fixed it for myself; I am not sure it is the recommended approach.

    File: com.espertech.esper.epl.expression.ExprRegexpNode

    Reason: I think it is up to the user how the regexp is constructed; anchoring it should not be imposed by the framework.

    // Inside this method
    public object Evaluate(EventBean[] eventsPerStream, bool isNewData, ExprEvaluatorContext exprEvaluatorContext){...}
    
    // Find two occurrences of
    _pattern = new Regex(String.Format("^{0}$", patternText));
    
    // And change to
    _pattern = new Regex(patternText);
    

    File: com.espertech.esper.epl.parse.ASTConstantHelper

    Reason: unescaping is required for all strings except the regexp pattern, because unescaping breaks a valid regexp by removing valid symbols from it.

    // Inside this method  
    public static Object Parse(ITree node){...}
    
    // Find one occurrence of
    case EsperEPL2GrammarParser.STRING_TYPE:
    {
        return StringValue.ParseString(node.Text);
    }
    
    // And change to
    case EsperEPL2GrammarParser.STRING_TYPE:
    {
        bool requireUnescape = true;

        // Skip unescaping when the string literal is the operand of a regexp node
        if (node.Parent != null)
        {
            if (!String.IsNullOrEmpty(node.Parent.Text))
            {
                if (node.Parent.Text == "regexp")
                {
                    requireUnescape = false;
                }
            }
        }

        return StringValue.ParseString(node.Text, requireUnescape);
    }
    

    File: com.espertech.esper.type.StringValue

    Reason: unescape all strings except the regexp value.

    // Inside this method  
    public static String ParseString(String value){...}
    
    // Change from
    public static String ParseString(String value)
    {
        if ((value.StartsWith("\"")) & (value.EndsWith("\"")) || (value.StartsWith("'")) & (value.EndsWith("'")))
        {
            if (value.Length > 1)
            {               
                if (value.IndexOf('\\') != -1)
                {
                    return Unescape(value.Substring(1, value.Length - 2));
                }
    
                return value.Substring(1, value.Length - 2);
            }
        }
    
        throw new ArgumentException("String value of '" + value + "' cannot be parsed");
    }   
    
    // Change to
    public static String ParseString(String value, bool requireUnescape = true)
    {
        if ((value.StartsWith("\"")) & (value.EndsWith("\"")) || (value.StartsWith("'")) & (value.EndsWith("'")))
        {
            if (value.Length > 1)
            {
                if (requireUnescape)
                {
                    if (value.IndexOf('\\') != -1)
                    {
                        return Unescape(value.Substring(1, value.Length - 2));
                    }
                }
    
                return value.Substring(1, value.Length - 2);
            }
        }
    
        throw new ArgumentException("String value of '" + value + "' cannot be parsed");
    }
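
To illustrate why these two behaviours matter, here is a minimal sketch that reproduces their effect with plain .NET, without NEsper. It rests on assumptions: DropBackslashes is a hypothetical stand-in that simply removes every backslash, and NEsper's real Unescape may treat individual escape sequences differently. Anchor mirrors the String.Format("^{0}$", patternText) call quoted above. The class name and the sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; it does not call into NEsper.
public static class RegexpHandlingDemo
{
    public const string IpPattern = @"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b";

    // ASSUMPTION: stand-in for the unescape step, it just drops all backslashes.
    public static string DropBackslashes(string text)
    {
        return text.Replace("\\", "");
    }

    // Same wrapping as the String.Format("^{0}$", patternText) call in ExprRegexpNode.
    public static string Anchor(string patternText)
    {
        return String.Format("^{0}$", patternText);
    }

    public static void Main()
    {
        // With backslashes removed, \b and \d turn into the literal letters b and d.
        string stripped = DropBackslashes(IpPattern);
        Console.WriteLine(stripped);                               // bd{1,3}.d{1,3}.d{1,3}.d{1,3}b
        Console.WriteLine(Regex.IsMatch("127.0.0.5", IpPattern));  // True
        Console.WriteLine(Regex.IsMatch("127.0.0.5", stripped));   // False

        // Anchoring turns a substring search into a whole-string match.
        const string line = "host 127.0.0.5 up";
        Console.WriteLine(Regex.IsMatch(line, IpPattern));          // True
        Console.WriteLine(Regex.IsMatch(line, Anchor(IpPattern)));  // False
    }
}
```

For the input in the question, the lost backslashes alone are enough to explain the missing match; the anchoring only changes the result when the value contains more than the address itself.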