I have a file receive location with edireceive pipeline configed to receive incoming HIPPA 5010 837 files.
The normal incoming file size is 4 to 6 megabytes, contains 3K to 5K records. The 837 schema deployed is the "multiple" version which have the subdocument_break="yes". So the file been processed will generate 3K to 5K messages per file.
The pipeline works fine and can split the file into multiple messages as expected. for 1 single file, BizTalk takes less than 5 mins to process.
The problem is when more than 10 files was put to the incoming folder at same time, Biztalk will start process these files parallel. But it will take hours to process these files and the BizTalk Host consumes more than 10G memory.
Some other info:
My question is: Is this performance normal? how can I improve it?
You are correct, 5K message is not really the issue, it 5 batchs of 5K message at the same time that's causing the problem.
To serialize the debatching you can use an Ordered Delivery Two Way Send Port with an Loopback Adapter which debatches the EDI on the Receive Side. In this case, the initial Receive Location would be a PassThrough.
You can find several Loopback Adapters here: http://social.technet.microsoft.com/wiki/contents/articles/12824.biztalk-server-list-of-custom-adapters.aspx#jjj