Why is this database trigger being executed twice by one commit?


I have developed a data processing engine in MarkLogic to handle data being exported into another VPC (virtual private cloud). I am not going to get into why we need this particular solution; I am aware of MLCP, etc. The data has to go through ETL and schema validation before it is offloaded to a third-party service that handles everything else.

Here is the basic design of the engine:

  1. A document in the exportable collection (in the content database) is created or modified
  2. One of two triggers will be fired, depending on whether the document is created or modified
    • Each trigger executes the same module exportDataTrigger.sjs
    • Create trigger event details:
      • "collection-scope": { "uri": "exportable" }
      • "document-content": { "update-kind": "create" }
      • "when": "post-commit"
    • Modify trigger event details:
      • "collection-scope": { "uri": "exportable" }
      • "document-content": { "update-kind": "modify" }
      • "when": "post-commit"
  3. The exportDataTrigger.sjs module adds a copy of the document to the export database, in the export collection
  4. A trigger is fired when a document is created with the export collection
    • The trigger executes the processExportData.sjs module
    • Create trigger event details:
      • "collection-scope": { "uri": "export" }
      • "document-content": { "update-kind": "create" }
      • "when": "post-commit"
  5. The processExportData.sjs module performs the following operations on the document:
    • Performs some ETL on the document content
    • Replaces the export collection with processed collection
    • Inserts the modified document via xdmp.documentInsert
      • Example: xdmp.documentInsert(uri, processedDocument, xdmp.documentGetPermissions(uri), 'processed');
  6. A trigger is fired when a document is modified with the processed collection:
    • The trigger executes the validateData.sjs module
    • Modify trigger event details:
      • "collection-scope": { "uri": "processed" }
      • "document-content": { "update-kind": "modify" }
      • "when": "post-commit"
  7. The validateData.sjs module performs the following operations on the document:
    • Data is compared against a JSON schema:
      • When data is valid:
        • An existing property validated is set to true
        • Replaces the processed collection with export-ready collection
        • Inserts the modified document via xdmp.documentInsert
          • Example: xdmp.documentInsert(uri, processedDocument, xdmp.documentGetPermissions(uri), 'export-ready');
      • When data does not match the schema:
        • The existing validated property is left at false, its default
        • Replaces the processed collection with needs-review collection
        • Inserts the modified document via xdmp.documentInsert
          • Example: xdmp.documentInsert(uri, processedDocument, xdmp.documentGetPermissions(uri), 'needs-review');
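
As a rough sketch, the branching in step 7 can be written as a pure function, kept apart from the MarkLogic calls. The name routeValidatedDocument and the isValid flag are hypothetical, and the schema check itself is not shown; this is an illustration of the described flow, not the actual module code.

```javascript
'use strict';

// Hypothetical helper: decide the outcome of step 7 for one document.
// `doc` is the document content as a plain object; `isValid` is the result
// of whatever JSON schema check the module performs (not shown here).
function routeValidatedDocument(doc, isValid) {
  const valid = isValid === true;
  return {
    // Copy the content instead of mutating the caller's object.
    content: Object.assign({}, doc, { validated: valid }),
    // The target collection replaces `processed` when the document is inserted.
    collection: valid ? 'export-ready' : 'needs-review'
  };
}

// Inside validateData.sjs the result would feed the insert shown in step 7:
//   const result = routeValidatedDocument(cts.doc(uri).toObject(), isValid);
//   xdmp.documentInsert(uri, result.content,
//     xdmp.documentGetPermissions(uri), result.collection);
```

Either branch ends in an xdmp.documentInsert on a URI that already exists, so from the database's point of view both are a modify of that document.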

Everything works fine until the trigger on step #6. It gets executed twice, even though the document is only updated once. I added a lot of logs throughout the code to check each transaction. I also checked to make sure I didn't actually add the trigger twice. I currently have my q-console set up to skip straight to step #3.

What would cause a trigger to fire twice on one update?

Edit:

I decided to comment out the code logic in the validateData.sjs module and left only the logs. It turns out the module is triggering itself one extra time. I still don't understand why, since I am committing the document once, with a new collection that the trigger should not fire on (either export-ready or needs-review).


Solution

  • Post-commit triggers are spawned and executed as a separate transaction from the one that updated the document.

    You didn't show how the triggers were created or what options were set. Was the recursive parameter set to false?

    https://docs.marklogic.com/trgr.createTrigger

    recursive: Set to true if the trigger should be allowed to trigger itself for recursive changes on the same document. Set to false to prevent the trigger from triggering itself. If this parameter is not present, then its value is true.

    If not, then the change that the trigger module makes to the document could fire the modify trigger again.

    Avoiding Infinite Trigger Loops (Trigger Storms)

    ...you can avoid trigger storms by setting the $recursive parameter in the trgr.createTrigger() function to fn:false().
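
A minimal sketch of creating the step 6 trigger with recursive set to false, using the server-side JavaScript form of trgr.createTrigger. The trigger name, description, module root and path are assumptions, since the question does not show how the triggers were created. The library is passed in as an argument only so the function can be exercised outside MarkLogic.

```javascript
'use strict';

// Sketch: define the step 6 trigger with `recursive` set to false.
// `trgr` is the triggers library, `modulesDb` is the ID of the database that
// holds validateData.sjs, `permissions` are the trigger document permissions.
function createValidateDataTrigger(trgr, modulesDb, permissions) {
  return trgr.createTrigger(
    'validateDataTrigger',                             // hypothetical name
    'Validates documents in the processed collection', // hypothetical description
    trgr.triggerDataEvent(
      trgr.collectionScope('processed'),
      trgr.documentContent('modify'),
      trgr.postCommit()
    ),
    trgr.triggerModule(modulesDb, '/', 'validateData.sjs'), // root and path assumed
    true,        // enabled
    permissions,
    false        // recursive: the trigger may not fire itself
  );
}

// In Query Console, evaluated against the triggers database:
//   declareUpdate();
//   const trgr = require('/MarkLogic/triggers.xqy');
//   createValidateDataTrigger(trgr, xdmp.database('Modules'),
//     xdmp.defaultPermissions());
```

If a trigger with the same name already exists, it would presumably have to be removed first (for example with trgr.removeTrigger) before it can be recreated with the new setting.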