BizTalk - How to throttle a streaming disassemble pipeline


I need to limit the number of orchestration instances spawned while debatching a large message in a streaming disassemble receive pipeline. Let’s say I have a large XML message coming in that contains 100 000 separate "Order" messages. The receive pipeline would debatch it and create 100 000 "ProcessOrder" orchestration instances. This is too many, and I need to limit that number.

Requirements

  1. The debatching needs to be done in a streaming manner, so that only one "Order" message is loaded in memory at a time before it is sent to the MessageBox;
  2. The debatching needs to be throttled based on the number of currently running "ProcessOrder" orchestration instances (say, if 100 instances are already running, the debatching would wait until one completes before sending another "Order" message to the MessageBox).
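
To make the two requirements concrete, here is a rough, language-neutral sketch in Python. It is not BizTalk code: `iter_orders`, `debatch_throttled`, `process_order` and `max_running` are hypothetical names, the thread pool merely stands in for running "ProcessOrder" instances, and the sketch assumes the "Order" elements are direct children of the batch root with no XML namespace.

```python
import threading
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def iter_orders(stream):
    """Yield one serialized <Order> at a time instead of building the whole tree."""
    context = ET.iterparse(stream, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the batch root
    for event, elem in context:
        if event == "end" and elem.tag == "Order":
            yield ET.tostring(elem, encoding="unicode")
            root.clear()  # detach orders already handed out so they can be freed


def debatch_throttled(stream, process_order, max_running=100):
    """Hand out orders one by one, blocking while max_running are in flight."""
    slots = threading.BoundedSemaphore(max_running)
    published = 0

    def run(order_xml):
        try:
            process_order(order_xml)
        finally:
            slots.release()  # a finished instance frees one slot

    with ThreadPoolExecutor(max_workers=max_running) as pool:
        for order_xml in iter_orders(stream):
            slots.acquire()  # waits here until a running instance completes
            pool.submit(run, order_xml)
            published += 1
    return published
```

The point of the sketch is only that the producer pulls the next "Order" lazily and blocks on a counter of running instances, which is the behaviour described in requirements 1 and 2.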

Where I'm at

  1. I have the receive pipeline that does the debatching and functional modifications to my messages. It does what it should in a streaming manner and puts individual messages in VirtualStreams;

  2. I have an orchestration and helper methods that can limit the number of “ProcessOrder” orchestration instances.

The problem

I know that I can run a receive pipeline inside an orchestration (and that would solve my problem, since on every "getnext" call to the pipeline I could simply wait if there are too many running orchestration instances). But, digging into the BizTalk DLLs, I noticed that Microsoft.XLANGs.Pipeline.XLANGPipelineManager still loads all the messages into memory instead of enumerating them like Microsoft.BizTalk.PipelineOM.PipelineManager does. I know each message is put in a VirtualStream, but this is still inadequate, memory-wise, for such a large number of messages.

Question

My next step would be to run the receive pipeline directly in the receive port (so it would use Microsoft.BizTalk.PipelineOM.PipelineManager), without the orchestration that limits the number of “ProcessOrder” instances. But to meet the requirements, I would need to add delay logic to my pipeline. Is this a viable option? If not, why not? And what other alternatives do I have?


Solution

  • You should debatch all the messages once in the pipeline and store the individual messages in MSMQ before they are processed by the orchestration. Use the standard pipeline to debatch the messages, since it handles debatching of large files efficiently. MSMQ is a Windows component that can be enabled for free through "Turn Windows features on or off". Using MSMQ is easy and does not require any development. Sending to MSMQ is fast; 100K messages should not be an issue.

    Then have a receive location read from MSMQ. Depending on your orchestration throughput, you can control the message flow by using BizTalk receive host throttling, by receiving the messages from MSMQ in order, or by combining both. Make sure you use separate host instances for receiving from MSMQ, for sending to MSMQ, and for your orchestration processing.

    All of this is done through configuration, without any extra code, which simplifies your design. Make sure your orchestration has a minimum number of persistence points.
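
    The buffering idea itself can be illustrated with a small Python sketch. This is only an analogy, not BizTalk or MSMQ configuration: the in-memory `queue.Queue` stands in for the MSMQ queue, and `drain_with_workers`, `handle` and `workers` are hypothetical names for the throttled consumer side.

```python
import queue
import threading


def drain_with_workers(buffer, handle, workers=4):
    """Consume every buffered message with at most `workers` concurrent handlers."""

    def worker():
        while True:
            try:
                msg = buffer.get_nowait()
            except queue.Empty:
                return  # nothing left to process
            try:
                handle(msg)
            finally:
                buffer.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    # The producer debatches everything up front; this is cheap because the
    # queue only stores messages and does not start any processing.
    buffer = queue.Queue()
    for i in range(1000):
        buffer.put("Order %d" % i)

    processed = []
    # workers=1 handles messages strictly in FIFO order; a larger value
    # trades ordering for throughput.
    drain_with_workers(buffer, processed.append, workers=1)
    print(len(processed))
```

    The debatching side can run at full speed because the queue absorbs the burst, while the number of consumers caps how many orders are processed at once.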