Search code examples
spring-bootreactive-programmingmarklogicmarklogic-9

Reactive approach of Marklogic database


Does Marklogic supports backpressure or allow to send data in chunks that is reactive approach ?


Solution

  • 'Reactive' is a fairly new term describing a particular incarnation of old concepts common in server and database technologies, but fairly new to modern client and middle-tier programming.

    I am assuming the question is prompted by the need/desire to work within an existing 'Reactive' framework (such as vert.x or Rx/Java). For that question, the answer is 'no' - there is not an 'official' API which integrates directly with these frameworks to my knowledge. There are community APIs which I have not personally used, an example is https://github.com/etourdot/vertx-marklogic (reactive, vert.x marklogic API).

    MarkLogic is a 'reactive' design internally in that it implements the functionality the modern 'reactive' term is used to describe -- but does not expose any standard 'reactive APIs' for this (there are very few standards in this area). Code running within MarkLogic server (xquery,javascript) implicitly benefits from this - although there is not an explicit backpressure API, a side effect of single threaded blocking IO (from the app perspective) is that the equivalent of 'back pressure' is implemented by implicit flow control of the IO APIS - you cannot over drive a properly configured ML server on a single thread doing blocking IO. Connections to an overloaded server will take longer and eventually time out ('backpressure' :)

    Similarly, (most of) the external APIs (REST, XCC) are also blocking, single threaded. The server core manages rate control via a variety of methods such as actively managing the TCP connection queue size, keep alive times, numbers of active threads etc.

    In general the server does a very good job at this without explicit low level programming needed, balancing the latency across all clients. If this needs improving, the administration guides have good direction on how to tune the various parameters so the system behaves well on its own.

    If you want to implement a per-connection client aware 'reactive' API you will need to implement it yourself. This can be done using the same techniques used for other blocking IO APis -- i.e. either use multiple threads or non-blocking IO. Some of the ML SDK's have provision for non-blocking IO or control over timeouts which can be used to implement a 'reactive' API.

    Similarly, code running in the server itself (XQuery or JavaScript) can implement 'reactive' type behaviour by making use of the task queue -- as exposed by the xdmp:spawn-xxx apis. This is done in many libraries to manage bulk ingest. Care must be taken to carefully control the amount of concurrency as you can easily overload the server by spawning too many concurrent requests. Managing state is a bit tricky as there is a interaction/opposition between the transaction model and task creation -- the former generally presenting an idempotent view of data that can be incongruous with the concept of 'current' wrt asynchronous tasks.