Difference between Drools engine and Database

I was going through the Drools documentation and could not see what interesting problem it solves (maybe I'm wrong).

In Drools we specify the business rules (in a .drl file) as, for example,

  when "type = jewellery" then setDiscount(25%)
  when "type = KidDress" then setDiscount(30%)

  1. What is the difference between the above and using a database?

  2. I can ALWAYS expose custom APIs through which business rules can be specified, and store the rules directly in an RDBMS. If required, I can build a simple UI (in 1-2 days) that integrates with the exposed APIs. If I expose CRUD operations, this also lets business people easily add, update, and delete rules.

For something as simple as what I described, what problem is Drools solving? I cannot find an answer through Google search or in the official documentation.

Can someone help here?


  • In contrast to Karol's answer, I also used Drools but I had an excellent experience with it. The use cases in the documentation are intentionally simplified, but Drools can also handle much more complex use cases more efficiently than a database. I know this for a fact, because a service that I maintained with ~1.4 million rules was converted to using a database (using the same arguments you presented). It went from averaging 30-100 ms to respond to a query, to taking 750 ms to over 2 minutes to respond (how much longer I do not know, because we timed out our queries after 2 minutes).

    The reason for this was that Drools allowed us to implement "fall through" logic. In this case, my 1.4 million rules were determining whether a hospital patient would need authorization from their insurance to have a procedure done at a hospital. The rules ranged from very general to very specific; if two rules matched the input data, we favored the more specific rule. Special cases applied if a specific hospital or hospital+insurance combination had a custom rule. We passed in everything we knew about the patient, including their entire medical history and a ton of information about their insurance, and then the rules decided on the answer.

    Imagine this intentionally simplified scenario:

    rule "Car"
      Car() // very simple, I have a car
    rule "Red Car"
      Car( color == "red" ) // I have a red car
    rule "4-door Car"
      Car( doors == 4 ) // I have a 4-door car
    rule "Red Sedan"
      Car( color == "red", model == "sedan") // I have a red sedan
    rule "Blue 4-Door Discount"
      Car( doors == 4, color == "blue") // I have a blue 4-door car

    Now we start plugging in different configurations of Car. (Assume each rule's consequence, omitted above, sets a price.) A yellow 2-door sports car matches only the first rule, and the price is 100. A red 4-door car also matches two more specific rules; is the price 75 or 200? It depends on how you've written your rules and what "set price" does; with the rules as I've written them here, the price is likely 200. A blue sedan? 100. And so on.
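
    As a rough illustration of the fall-through idea, here is a minimal plain-Java sketch (not Drools, and not a description of how Drools resolves conflicts internally). It checks a car against the five conditions above and lets the most specific match win, with ties going to the later rule. The `Rule` record, the `specificity` count, which rule gets 75 and which gets 200, and the prices 60 and 150 are all assumptions made for this example; only the 100 for the plain "Car" rule follows directly from the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class CarRules {
    // The fact being matched; fields mirror the Car pattern used in the rules above.
    record Car(String color, String model, int doors) {}

    // Hypothetical rule shape: name, number of constraints, condition, price to set.
    record Rule(String name, int specificity, Predicate<Car> when, int price) {}

    // Prices other than 100 are assumed for illustration.
    static final List<Rule> RULES = List.of(
        new Rule("Car", 0, c -> true, 100),
        new Rule("Red Car", 1, c -> c.color().equals("red"), 75),
        new Rule("4-door Car", 1, c -> c.doors() == 4, 200),
        new Rule("Red Sedan", 2,
                 c -> c.color().equals("red") && c.model().equals("sedan"), 60),
        new Rule("Blue 4-Door Discount", 2,
                 c -> c.doors() == 4 && c.color().equals("blue"), 150)
    );

    // Names of every rule whose condition holds, in declaration order.
    static List<String> matching(Car car) {
        List<String> names = new ArrayList<>();
        for (Rule r : RULES) {
            if (r.when().test(car)) names.add(r.name());
        }
        return names;
    }

    // Fall-through: the most specific matching rule wins; ties go to the later rule.
    static int price(Car car) {
        Rule best = null;
        for (Rule r : RULES) {
            if (r.when().test(car)
                    && (best == null || r.specificity() >= best.specificity())) {
                best = r;
            }
        }
        return best.price(); // "Car" matches everything, so best is never null
    }

    public static void main(String[] args) {
        System.out.println(price(new Car("yellow", "sports", 2)));    // 100
        System.out.println(price(new Car("red", "hatchback", 4)));    // 200
        System.out.println(price(new Car("blue", "sedan", 2)));       // 100
        System.out.println(matching(new Car("red", "hatchback", 4))); // [Car, Red Car, 4-door Car]
    }
}
```

    Even in this toy version, the "which match wins" policy lives in code rather than in the data, which is the part that becomes awkward to express as a single query over a rules table.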

    If we converted this into a database table (for simplicity, a single table Car with columns 'color', 'model', and 'doors'), what would that query look like? (I don't actually know; I didn't manage to write a query that would suffice, but I'm also not a DBA.)

    I could come up with a whole set of examples where a database-based solution would be less performant, or not recommended at all. For example, I once implemented a pseudo-BFS algorithm using rules to figure out an optimal upgrade path from an arbitrary hardware configuration to the latest supported configuration. (Each version could only be upgraded to certain other versions, so we needed to figure out the fastest path from a given version to a target version, if one existed.) Could this have been done in a database? Possibly, but it's not the sort of thing a relational db would be good for. What about code? Sure, but now you'd have to manage your list of what-can-upgrade-to-what in code.
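
    To make the upgrade-path problem concrete, here is a small breadth-first search in plain Java. This is the "in code" variant mentioned above, not the rule-based implementation, and it assumes "fastest" means fewest upgrade hops. The version numbers and the `UPGRADES` map are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UpgradePath {
    // Hypothetical data: the versions each version can be upgraded to directly.
    static final Map<String, List<String>> UPGRADES = Map.of(
        "1.0", List.of("1.1", "2.0"),
        "1.1", List.of("2.0"),
        "2.0", List.of("2.5", "3.0"),
        "2.5", List.of("3.0"),
        "3.0", List.of()
    );

    // Breadth-first search: returns the path with the fewest hops from `from` to `to`,
    // or an empty list if `to` cannot be reached.
    static List<String> shortestPath(String from, String to) {
        Map<String, String> cameFrom = new HashMap<>(); // version -> predecessor
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(from, null); // the start has no predecessor
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                // Walk the predecessor links back to the start, then reverse.
                List<String> path = new ArrayList<>();
                for (String v = to; v != null; v = cameFrom.get(v)) path.add(v);
                Collections.reverse(path);
                return path;
            }
            for (String next : UPGRADES.getOrDefault(current, List.of())) {
                if (!cameFrom.containsKey(next)) { // visit each version once
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return List.of();
    }

    public static void main(String[] args) {
        System.out.println(shortestPath("1.0", "3.0")); // [1.0, 2.0, 3.0]
        System.out.println(shortestPath("3.0", "1.0")); // []
    }
}
```

    Note that the what-can-upgrade-to-what list is hard-coded in `UPGRADES` here, which is exactly the maintenance burden described above.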

    For extremely simple rule sets where each rule is completely exclusive and covers all use cases? Sure, a database might be more performant. Real-world situations, however, would either require overly complex queries, or might not be appropriate for a database at all.

    And decision tables? Avoid them at all costs. They are slow to load, slow to execute, hog way more memory than they need, have undocumented limitations that you'll run into if trying to use them at scale, and debugging them is a pain.