powershell msmq semicolon

Faster way to delete up to the 4th colon in a variable?


I'm trying to query MSMQ with PowerShell and then read (in my case) XML out of the BodyStream. Stripping out the crap before the XML is slow... like 200-500 ms per record.

Pulling the XML out seems fast enough (below), but that -creplace takes forever.

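$msmq = Get-MsmqQueue -Name "myqueuelovesme"
# $enc is an encoding object defined elsewhere (not shown in this snippet).
$var = Receive-MsmqQueue -input $msmq -Peek -Count 500 -Timeout 5000 -RetrieveBody | foreach {
    # Decode the raw body bytes into a string.
    $sr = $_.BodyStream.ToArray()
    $pr = $enc.GetString($sr)
    # Strip everything up to and including ":Message:" so only the XML remains.
    [xml]$clean = $pr -creplace ".*:Message:"
    Select-Xml -Xml $clean -XPath "/MyXML/Defin" | foreach { $_.Node }
}
$var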

The contents of bodystream look like this:

F:01.02.03:1234567890:Message:<MyXML><Defin xmlns:dsi="http:// [...]

The XML always starts after that 4th colon, so I don't think I can split/join (as the number of colons in the XML can change). I see that -creplace is faster than -replace, but neither is great.

Any suggestions? (And heck, is the rest of the code okay?)


Solution

  • There are tonnes of options here, I imagine, and I'm not sure which is best. I will try to add some as we go, and other solutions will surely show up. You said you don't think you can split/join... BLAM!

    'F:01.02.03:1234567890:Message:<MyXML><Defin xmlns:dsi="http:// [...]'.Split(":",5)[-1]
    

    Split the string on colons, but into at most 5 elements, so any colons after the 4th stay inside the last element. Then return only that last element, which would be <MyXML><Defin.... You would apply this to your implementation as:

    [xml]$clean= $pr.Split(":",5)[-1]
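
To see why the count argument matters, here is a small sketch using a made-up message body. The sample string, its `urn:example` namespace and the `a:b` text are assumptions for illustration only, not the real queue contents; the point is that the XML part has colons of its own.

```powershell
# Hypothetical body: same "F:...:Message:" prefix shape as in the question,
# followed by made-up XML that contains colons of its own.
$pr = 'F:01.02.03:1234567890:Message:<MyXML><Defin xmlns:dsi="urn:example">a:b</Defin></MyXML>'

# Split on ':' into at most 5 pieces; the 5th piece is everything after the 4th colon.
$parts = $pr.Split(':', 5)

$parts.Count    # 5
$parts[-1]      # the XML, with its own colons untouched

# With the prefix gone, the cast to [xml] works and the XPath from the question applies.
[xml]$clean = $parts[-1]
$clean.SelectSingleNode('/MyXML/Defin').InnerText    # a:b
```

Without the count, the same call would also cut the XML apart at every colon inside it, which is the problem the question was worried about.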
    

    You could also just split on the first < using the same logic.

    [xml]$clean= "<$($pr.Split("<",2)[-1])"
    

    Since the split consumes the first <, we just add it back to the front of the string.

    I ran a Measure-Command test using IndexOf and Split. Running each 100,000 times took about 2 seconds; they were neck and neck. The same test with -creplace took about 7 seconds. Measure-Command is not perfect and its results can be questioned, but it would seem either answer here (so far) would perform better than -creplace.
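
The exact test script is not shown, so the following is only a rough sketch of how such a comparison might be set up. The IndexOf form used here (substring from the first `<`) is an assumption about what was timed, and the sample body is made up. Absolute timings will vary by machine and PowerShell version.

```powershell
# Hypothetical body; the real BodyStream contents will differ.
$pr = 'F:01.02.03:1234567890:Message:<MyXML><Defin xmlns:dsi="urn:example">a:b</Defin></MyXML>'
$n  = 100000   # iteration count, matching the test described above

# Three ways to drop the prefix; for this sample they all yield the same string.
$bySplit   = $pr.Split(':', 5)[-1]
# Note: IndexOf returns -1 when there is no '<', and Substring would then throw.
$byIndexOf = $pr.Substring($pr.IndexOf('<'))
$byReplace = $pr -creplace '.*:Message:'

# Time each approach over $n iterations, discarding the results.
$tSplit   = Measure-Command { for ($i = 0; $i -lt $n; $i++) { $null = $pr.Split(':', 5)[-1] } }
$tIndexOf = Measure-Command { for ($i = 0; $i -lt $n; $i++) { $null = $pr.Substring($pr.IndexOf('<')) } }
$tReplace = Measure-Command { for ($i = 0; $i -lt $n; $i++) { $null = $pr -creplace '.*:Message:' } }

'Split:    {0:N0} ms' -f $tSplit.TotalMilliseconds
'IndexOf:  {0:N0} ms' -f $tIndexOf.TotalMilliseconds
'creplace: {0:N0} ms' -f $tReplace.TotalMilliseconds
```

Checking that all three produce the same string before timing them guards against comparing approaches that do different work.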