multithreading web-services lotus-domino lotusscript

How to limit the number of instances of a LotusScript agent consuming a WS (heavily loaded Domino server, detect crash/stalling conditions)


I need to limit the number of instances of an Agent. Is the following code the best technique?

If counter > 50 then exit sub
Counter++
Call WS    '//this could time out after 110 seconds
Counter--

This pseudocode will be written in LotusScript (the agent is already in LS). For the programmatic side of the counter, what is the best approach under heavily concurrent calls?

  1. Lock the counter document, then save (loop on doc.Lock until it succeeds)
  2. Save the doc and test for a save conflict (loop on re-fetching the doc while doc.Save(False, False) returns False)
  3. LockID = CreateLock(LockName As String), then update (loop on CreateLock)
  4. any better idea?
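
Option 3 could be sketched roughly as follows. This is only an illustration under stated assumptions, not a tested solution: the lock name, the notes.ini variable name and the helper AdjustCounter are hypothetical, and keeping the counter in a notes.ini variable (rather than in a document) is a choice made here to keep the example short. The counter is per server, and it would stay stale if the server crashed while agents were running, so it would need to be reset at startup.

```lotusscript
Option Declare

Const MAX_INSTANCES = 50             ' threshold from the pseudocode above
Const LOCK_NAME = "WSAgentCounter"   ' hypothetical lock name
Const COUNTER_VAR = "WSAgentCount"   ' hypothetical notes.ini variable

' Adds delta to the shared counter while holding the code lock.
' Returns False, and changes nothing, if an increment would exceed the limit.
Function AdjustCounter(session As NotesSession, delta As Integer) As Boolean
	Dim lockId As Integer
	Dim current As Integer

	lockId = CreateLock(LOCK_NAME)
	Call CodeLock(lockId)            ' waits until the lock is free
	current = Cint(Val(session.GetEnvironmentString(COUNTER_VAR)))
	If delta > 0 And current >= MAX_INSTANCES Then
		AdjustCounter = False
	Else
		Call session.SetEnvironmentVar(COUNTER_VAR, Cstr(current + delta))
		AdjustCounter = True
	End If
	Call CodeUnlock(lockId)
	Call DestroyLock(lockId)
End Function

Sub Initialize
	Dim session As New NotesSession

	' Too many instances already running: give up immediately
	If Not AdjustCounter(session, 1) Then Exit Sub

	' Make sure the counter is decremented even if the WS call fails
	On Error Goto cleanup
	' ... call the web service consumer here ...
cleanup:
	Call AdjustCounter(session, -1)
	Exit Sub
End Sub
```

The point of the error handler is that a timeout raised by the WS call must not skip the decrement; otherwise the counter would drift upward until every call is rejected.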

I have already read:

  • Are web services processed sequentially or in parallel?
  • http://www-10.lotus.com/ldd/ddwiki.nsf/dx/sequential-numbering.htm
  • A ten-year-old but still very accurate document: http://www.eview.com/eview/volr6.nsf/0/62B3A667117B484385256F3300576ECF/$File/Guide%20to%20Document%20Locking%20SO%20604.pdf

More about the context: an agent on a clustered Domino web server is called heavily (300+ calls per minute during peaks). This agent uses a Domino web service consumer to call another system's WS.

Sometimes the external system has problems and does not return a response "in a reasonable amount of time" (I get a timeout after about 110 seconds).

The problem: when this happens, all the workers (the HTTP threads, determined by "Number active threads:" in the server document configuration) are waiting for a response. As a result, the Domino server stops responding! After the stalled agents have timed out, the next calls to the agent start waiting for a timeout in turn, causing a huge queue of requests until the server crashes.

To prevent the agent from exhausting the resources (the threads/workers in this case), I'm planning to increment a counter at the beginning of the agent and decrement it when the agent finishes. That way I should know the number of running instances of the agent at any time, and a simple test at the beginning of the agent will do the job.

N.B. Using wsConsumer.SetTimeout(ms) could lower the time the agent waits but can't solve the problem.
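
For illustration, a shorter timeout might be set as sketched below. The script library name ExternalSystemConsumer, the class ExternalSystemPort and the method GetStatus are hypothetical placeholders for whatever the WSDL import generated; only SetTimeout comes from the text above. As a rough estimate, at 300 calls per minute (5 per second) a 10-second timeout still ties up about 5 × 10 = 50 threads while the external system is down, which is why the timeout alone does not solve the problem.

```lotusscript
Option Declare
' Hypothetical consumer script library generated from the external WSDL
Use "ExternalSystemConsumer"

Sub Initialize
	Dim ws As New ExternalSystemPort   ' hypothetical consumer class
	Dim result As String

	On Error Goto wsFailed
	Call ws.SetTimeout(10000)          ' wait at most 10 s instead of ~110 s
	result = ws.GetStatus()            ' hypothetical WS operation
	Exit Sub

wsFailed:
	' Reached on a timeout or any other error raised by the call
	Print "External system unavailable: " & Error$
	Exit Sub
End Sub
```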


Solution

  • Rather than get involved in locking a single document containing a counter, perhaps you could use profile documents. I.e.,

    • Choose a NotesDatabase that contains no profile documents for any other purpose.
    • At startup, do Set profileDoc = NotesDatabase.GetProfileDocument("someProfileName", NotesSession.UserName). (If the user name is not necessarily unique per instance, then use something else as the key.)
    • Before calling the WS, do NotesDatabase.CreateNoteCollection(False), set NotesNoteCollection.SelectProfiles = True, call NotesNoteCollection.BuildCollection(), and then check NotesNoteCollection.Count.
    • If the count exceeds your threshold, then exit with an error indicating "too busy"; otherwise go ahead with the WS call.
    • In either case, at termination call profileDoc.RemovePermanently(True).

    My thought is that using profile documents would be faster than using ordinary documents due to the caching that is done. The downside might be that the caching might make the count unreliable for this purpose, but I don't know if that's the case.
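
    The steps above might look roughly like this in LotusScript. It is a minimal sketch, not a tested implementation: the profile name is hypothetical, the per-instance key is generated with @Unique as one possible way to get a unique value, and the explicit Save is an assumption made so that the marker is stored in the database before the collection is built. Markers left behind by a server crash would inflate the count and would need to be cleaned up separately.

```lotusscript
Option Declare

Const MAX_INSTANCES = 50           ' threshold from the question
Const PROFILE_NAME = "WSRunning"   ' hypothetical profile name

Sub Initialize
	Dim session As New NotesSession
	Dim db As NotesDatabase
	Dim marker As NotesDocument
	Dim nc As NotesNoteCollection
	Dim uniqueKey As Variant

	Set db = session.CurrentDatabase

	' One profile document per running instance, keyed by a unique value
	uniqueKey = Evaluate({@Unique})
	Set marker = db.GetProfileDocument(PROFILE_NAME, Cstr(uniqueKey(0)))
	Call marker.Save(True, False)

	' From here on, always remove the marker, even if something fails
	On Error Goto cleanup

	' Count the profile documents, i.e. the instances currently running
	Set nc = db.CreateNoteCollection(False)
	nc.SelectProfiles = True
	Call nc.BuildCollection

	If nc.Count > MAX_INSTANCES Then
		Print "Too busy, please retry later"
	Else
		' ... call the web service consumer here ...
	End If

cleanup:
	Call marker.RemovePermanently(True)
	Exit Sub
End Sub
```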