Search code examples
drools

Understanding Phreak algorithm's performance


I'm trying to understand what is making Drools perform poorly in our use case. I read the Phreak documentation here: https://access.redhat.com/documentation/en-us/red_hat_decision_manager/7.4/html/decision_engine_in_red_hat_decision_manager/phreak-algorithm-con_decision-engine

It doesn't seem to mention anything regarding how node to node jumps are done. Here's an example to explain what I mean by that:

Let's say I have a Person object with three fields: name, lastName, SSN

I define a large number of rules this way: when lastName = 'Smith' and SSN = 'XXXXXXXXXX' then name = "Jane". Assuming I have a large number of people with "Smith" as last name, say 10,000, what is the complexity to get a single name given a last name and an SSN? Would it take 10,000 comparisons, or does the "Smith" node keep some form of hash map with all the underlying SSN?

What if instead of an SSN with an equality operator I used a range, like a date range for example, defining rules like this: "when lastName = 'Smith' and startedShool >= 2005 and finishedSchool <= 2010". Do nodes keep some fancy data structure to speed up the queries with range operators?

EDIT: Per request I'm adding an example to explain how the rules are set up

We have a single class called Evidence as both input and output. We build each Evidence instance in a different activation-group and add it to a set. We usually define a few catch-all, broad rules with low salience and add rules with more specific conditions and higher salience.

This is a sample of two rules with increasing salience. We define ~10K of these.

One can imagine a sort of tree structure where at each level the salience increases and the conditions become more stringent. The Evidence class functions as a sort of key-value pair, so many of them will have the same node in common (e.g. name = location).

To execute the rules, we would add two evidences (e.g. [name= location, stringValue='BBB'] and [name = code, stringValue = theCode]) and fire the rules.

rule "RULE 1::product"
    salience 0
    activation-group "product"
when
    $outputs : EvidenceSet.Builder(  )  

    Evidence( name == "location" && stringValue == "BBB" )   
then
    $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("999"));


end

rule "RULE 2::product"
    salience 1
    activation-group "product"
when
    $outputs : EvidenceSet.Builder(  )  

    Evidence( name == "location" && stringValue == "BBB" )   

    Evidence( name == "code" && stringValue == "thecode" )   
then
    $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("777"));


end

Solution

  • Here is the rule template

    rule "RULE %s::product"
    when
        $outputs : EvidenceSet.Builder(  )  
        Evidence( name == "location" && stringValue == "BBB" ) 
        Evidence( name == "code" && stringValue matches ".*code.*" )
    then
        $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("777"));
    end
    

    Here is the code which built 10000 rule drl out of the template

    public static void main(String[] args) throws IOException {
        String template = Resources.toString(Resources.getResource("template.drl"), Charset.defaultCharset());
        
        try (PrintStream file = new PrintStream(new File("test.drl"))) {
            for (int i = 0; i < 10_000; i++)
                file.println(String.format(template, i));
        }
    }
    

    Here is domain classes based on your drl syntax, excluding equals, hashCode and toString, looks strange for me...

    public class Evidence {
        public String name;
        public String stringValue;
        
        public Evidence() {
        }
        
        public Evidence(String name, String stringValue) {
            this.name = name;
            this.stringValue = stringValue;
        }
        
        public static Builder newBuilder() {
            return new Builder();
        }
    }
    
    public class EvidenceSet {
        public static Set<Evidence> evidenceSet = new HashSet<>();
        
        public static class Builder {
            public Evidence evidence = new Evidence();
            
            public Builder setName(String name) {
                evidence.name = name;
                return this;
            }
            
            public Builder setStringValue(String stringValue) {
                evidence.stringValue = stringValue;
                return this;
            }
            
            public void addEvidences(Builder builder) {
                evidenceSet.add(builder.evidence);
            }
        }
    }
    

    Here is the test which executes 10k rules file

    @DroolsSession("classpath:/test.drl")
    public class PlaygroundTest {
        
        private PerfStat firstStat = new PerfStat("first");
        private PerfStat othersStat = new PerfStat("others");
        @Rule
        public DroolsAssert drools = new DroolsAssert();
        
        @Test
        public void testIt() {
            firstStat.start();
            drools.insertAndFire(new EvidenceSet.Builder(), new Evidence("location", "BBB"), new Evidence("code", "thecode"));
            firstStat.stop();
            
            for (int i = 0; i < 500; i++) {
                othersStat.start();
                drools.insertAndFire(new Evidence("code", "code" + i));
                othersStat.stop();
            }
            
            System.out.printf("first, ms: %s%n", firstStat.getStat().getAvgTimeMs());
            System.out.printf("others, ms: %s%n", othersStat.getStat().getAvgTimeMs());
            System.out.println(EvidenceSet.evidenceSet);
        }
    }
    

    The rule and the test should fit your requirements to match all decision nodes in the rule. Agenda group and salience has nothing to do with performance, skipped here for simplicity.

    The test without 'for' section, gives following profiling result profile1.png

    The 'winner' is org.drools.core.rule.JavaDialectRuntimeData.PackageClassLoader.internalDefineClass which creates and registers auto generated java class for drools then block, which is heavyweight operation.

    The test with 'for' section, gives totally different picture profile2.png

    Notice internalDefineClass stays the same, and test code execution becomes notable. Because classes of 10k rules were loaded and that took 12,5 secs on my 6 core (12 logical) CPU machine. After that all other insertions were executed in avg. 700 ms. Let analyze that time.

    Then block of the rules is executed in avg 0.01 ms.

    RULE 9::product - min: 0.00 avg: 0.01 max: 1.98 activations: 501
    

    10000 RHS * 0.01 ms = 100 ms is the time taken by 10k rules to execute the following

    $outputs.addEvidences(Evidence.newBuilder().setName("Product").setStringValue("777"));
    

    Dummy object creation and manipulation becomes notable because you multiple it to 10k. Most of the time it was printing test runner output to the console, according to second performance profiling results. And total 100 ms. RHS execution is quite fast taking into account you rude objects creation to serve 'the builder' paradigm.

    To answer your question, I would guess and suggest following issues:

    1. You may not reuse compiled rules session.

    2. It can be underestimated then block execution time, if you have dummy String.format somewhere in RHS it will be notable for the overall execution time just because format is comparably slow and you are dealing with 10k executions.

    3. There could be cartesian rules matches. Just because you are using set for results, you can only guess how many 'exactly the same results' were inserted into that set, implying huge execution waste.

    4. If you are using statefull session I don't see any memory cleanup and object retraction. Which may cause performance issues before OOME. Here is memory footprint of the run...

    profile3


    day2... cartesian matches

    I removed logging and made a test for cartesian matches

    @Test
    public void testIt() {
        firstStat.start();
        drools.insertAndFire(new EvidenceSet.Builder(), new Evidence("location", "BBB"), new Evidence("code", "thecode"));
        firstStat.stop();
        
        for (int i = 0; i < 40; i++) {
            othersStat.start();
            drools.insertAndFire(new Evidence("location", "BBB"), new Evidence("code", "code" + i));
            othersStat.stop();
        }
        
        drools.printPerformanceStatistic();
        System.out.printf("first, ms: %s%n", firstStat.getStat().getAvgTimeMs());
        System.out.printf("others, ms: %s%n", othersStat.getStat().getAvgTimeMs());
        System.out.println(EvidenceSet.evidenceSet);
    }
    

    Notice we are inserting new Evidence("location", "BBB") each iteration. I was able to run the test with 40 iterations only otherwise I end-up with OOME (something new to me). Here is CPU and memory consumption for 40 iterations:

    prifile4

    Each rule was triggered 1681 times! RHS avg. time is not notable (optimized and removed from execution?), but when block was executed...

    RULE 999::product - min: 0.00 avg: 0.00 max: 1.07 activations: 1681
    RULE 99::product - min: 0.00 avg: 0.00 max: 1.65 activations: 1681
    RULE 9::product - min: 0.00 avg: 0.00 max: 1.50 activations: 1681
    first, ms: 10271.322
    others, ms: 1614.294
    

    After removing cartesian match, the same test is executed much faster and without memory and GC issue

    RULE 999::product - min: 0.00 avg: 0.04 max: 1.27 activations: 41
    RULE 99::product - min: 0.00 avg: 0.04 max: 1.28 activations: 41
    RULE 9::product - min: 0.00 avg: 0.04 max: 1.52 activations: 41
    first, ms: 10847.358
    others, ms: 102.015
    

    profile5

    Now I can increase iteration count up to 1000 and average execution time of iteration is even less

    RULE 999::product - min: 0.00 avg: 0.00 max: 1.06 activations: 1001
    RULE 99::product - min: 0.00 avg: 0.00 max: 1.20 activations: 1001
    RULE 9::product - min: 0.00 avg: 0.01 max: 1.79 activations: 1001
    first, ms: 10986.619
    others, ms: 71.748
    

    Inserting facts without retraction would have it limits though

    profile5

    Summary: you need to make sure you don't get cartesian matches.


    Quick solution to omit any cartesian matches is to keep unique Evidences. Following is not 'common approach' of doing this, but it will not require to change and add even more complexity to rules conditions.

    Add a method to Evidence class

    public boolean isDuplicate(Evidence evidence) {
        return this != evidence && hashCode() == evidence.hashCode() && equals(evidence);
    }
    

    Add a rule with highest salience

    rule "unique known evidence"
    salience 10000
    when
        e: Evidence() 
        Evidence(this.isDuplicate(e))
    then
        retract(e);
    end
    

    This was tested and appeared to be working by previous junit test which reproduced cartesian issue. Interestingly, for the very test (1000 iterations, 10000 rules), isDuplicate method was invoked 2008004 times and total time taken by all invocations was 172.543 ms. on my machine. RHS of the rule (event retraction) took at least 3 times longer than other rules (evidences collection).

    RULE 999::product - min: 0.00 avg: 0.00 max: 1.07 activations: 1001
    RULE 99::product - min: 0.00 avg: 0.01 max: 1.81 activations: 1001
    RULE 9::product - min: 0.00 avg: 0.00 max: 1.25 activations: 1001
    unique known evidence - min: 0.01 avg: 0.03 max: 1.88 activations: 1000
    first, ms: 8974.7
    others, ms: 69.197