Text I'm trying to get:
przełącznica
This is what I actually have (the browser might not display it properly; there are two squares instead of "łą"):
przecznica
BLOB:
70 72 7A 65 C5 82 C4 85 63 7A 6E 69 63 61
EDIT: This is what I get from the parser:
70 72 7A 65 1A 1A 63 7A 6E 69 63 61
ESQL used to parse BLOB:
DECLARE blobMsg BLOB InputRoot.BLOB.BLOB ;
CREATE LASTCHILD OF OutputLocalEnvironment.Variables.inpMsg DOMAIN ('XMLNSC') NAME 'XMLNSC';
CREATE LASTCHILD OF OutputLocalEnvironment.Variables.inpMsg.XMLNSC PARSE(blobMsg OPTIONS FolderBitStream CCSID 1208 FORMAT 'XMLNSC');
I have tried CCSIDs 1208 (UTF-8), 912 (ISO-8859-2), and 1200 (UTF-16, I guess): https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_71/nls/rbagsccsidcdepgscharsets.htm
EDIT: Working code:
DECLARE blobMsg BLOB InputRoot.BLOB.BLOB;
DECLARE remove BLOB X'EFBBBF';
DECLARE message BLOB REPLACE(InputRoot.BLOB.BLOB, remove, CAST('' AS BLOB));
CREATE LASTCHILD OF OutputLocalEnvironment.Variables.inpMsg DOMAIN ('XMLNSC') NAME 'XMLNSC';
CREATE LASTCHILD OF OutputLocalEnvironment.Variables.inpMsg.XMLNSC PARSE(message OPTIONS FolderBitStream CCSID 05348 FORMAT 'XMLNSC');
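The byte sequence EF BB BF removed in the working code is the UTF-8 byte order mark. As a rough illustration outside of ESQL, the Python sketch below strips a leading BOM before decoding. The `payload` value is a made-up input (the BLOB from the question with a BOM prepended), and unlike the ESQL `REPLACE`, which removes every occurrence, this sketch only removes a BOM at the start.

```python
# UTF-8 byte order mark, the same bytes the ESQL code removes.
BOM = bytes.fromhex("EFBBBF")

# Hypothetical input: the BLOB from the question with a BOM in front.
payload = BOM + bytes.fromhex("70727A65C582C485637A6E696361")

# Strip the BOM only if it is actually present at the start.
stripped = payload[len(BOM):] if payload.startswith(BOM) else payload
text = stripped.decode("utf-8")

# Python's "utf-8-sig" codec does the same stripping while decoding.
via_codec = payload.decode("utf-8-sig")

print(text, via_codec)
```

Both approaches should yield the same text; the point is only that the BOM has to be dealt with before the bytes are handed to something that does not expect it.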
Firstly, przełącznica by itself is not valid XML, so you'll get an exception when you try to invoke the XMLNSC parser using the code you have outlined. You need to do a CAST instead.
I generated a little test Application/MsgFlow in IIB 10 to illustrate CASTing the BLOB.
The code in ConvertAndParse is
CREATE COMPUTE MODULE ConvertAndParse
    CREATE FUNCTION Main() RETURNS BOOLEAN
    BEGIN
        -- The bytes from the question, as a BLOB literal
        DECLARE blobMsg BLOB X'70727A65C582C485637A6E696361';

        -- Store the text, decoded as UTF-8 (CCSID 1208), in the LocalEnvironment
        CREATE LASTCHILD OF OutputLocalEnvironment.Variables.inpMsg DOMAIN 'XMLNSC';
        CREATE LASTCHILD OF OutputLocalEnvironment.Variables.inpMsg.XMLNSC NAME 'AsUtf8' VALUE CAST(blobMsg AS CHAR CCSID 1208);

        -- Build the output message: the text once as an element value and once as an element name
        CREATE LASTCHILD OF OutputRoot DOMAIN 'XMLNSC';
        CREATE LASTCHILD OF OutputRoot.XMLNSC.EncodingResponse NAME 'AsUtf8InTag' VALUE CAST(blobMsg AS CHAR CCSID 1208);
        CREATE LASTCHILD OF OutputRoot.XMLNSC.EncodingResponse NAME CAST(blobMsg AS CHAR CCSID 1208) VALUE 'As a tag name';
        RETURN TRUE;
    END;
END MODULE;
When I run a debug session the value put into the LocalEnvironment tree looks like.
And the result of invoking the flow from a browser.
Now let's deal with which encoding we are looking at. Taking what I assume is the input BLOB, let's see if it matches up with UTF-8.
70 72 7A 65 C5 82 C4 85 63 7A 6E 69 63 61
UTF-8 is a variable-width character encoding that sets the high-order bit of a byte to indicate that it is part of a sequence of two or more bytes. We also want a page that shows the common code points for UTF-8, such as the "Complete Character List for UTF-8". Note that it's not actually complete.
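As a rough illustration of the high-order-bit rule, the Python sketch below classifies each byte of the BLOB by its leading bits. The helper name `classify` is made up for this example and has nothing to do with IIB or ESQL.

```python
def classify(byte: int) -> str:
    """Classify one byte by its leading bits, following the UTF-8 layout."""
    if byte < 0x80:
        return "ascii"         # 0xxxxxxx: single-byte character
    if byte < 0xC0:
        return "continuation"  # 10xxxxxx: trailing byte of a sequence
    if byte < 0xE0:
        return "lead-2"        # 110xxxxx: starts a two-byte sequence
    if byte < 0xF0:
        return "lead-3"        # 1110xxxx: starts a three-byte sequence
    return "lead-4"            # 11110xxx: starts a four-byte sequence
                               # (bytes F8 and above are invalid; not checked here)


blob = bytes.fromhex("70 72 7A 65 C5 82 C4 85 63 7A 6E 69 63 61")
kinds = [classify(b) for b in blob]
for b, kind in zip(blob, kinds):
    print(f"{b:02X} {kind}")
```

Only C5 and C4 come out as lead bytes, each followed by exactly one continuation byte, which is the shape a run of two-byte UTF-8 characters would have.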
Looking at the first 4 bytes, none of them have the high-order bit on:
70 72 7A 65
And the aforementioned Character List says that's prze, so far so good.
Then we hit C5, which has the high-order bit on. Doing a bit of visual parsing, we get two probable two-byte character pairs:
C5 82
C4 85
Referring to the Character List, our two candidate pairs do in fact match the two characters we want, and the next six bytes, which do not have their high-order bits on, translate to cznica. Looking really good.
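The table lookup can also be cross-checked by decoding the two pairs by hand. This is a minimal Python sketch; `decode_two_byte` is a hypothetical helper written for this example, and it assumes its input really is a well-formed two-byte sequence.

```python
def decode_two_byte(lead: int, trail: int) -> str:
    """Decode one two-byte UTF-8 sequence (110xxxxx 10xxxxxx) by hand."""
    # Keep the low 5 bits of the lead byte and the low 6 bits of the trail byte.
    code_point = ((lead & 0x1F) << 6) | (trail & 0x3F)
    return chr(code_point)


blob = bytes.fromhex("70727A65C582C485637A6E696361")

first = decode_two_byte(0xC5, 0x82)   # U+0142
second = decode_two_byte(0xC4, 0x85)  # U+0105

# The built-in decoder should agree with the manual result.
text = blob.decode("utf-8")
print(f"U+{ord(first):04X} U+{ord(second):04X} {text}")
```

U+0142 is ł and U+0105 is ą, so both the manual bit arithmetic and the built-in decoder agree with the text we are trying to get.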
Now to eliminate the other candidate encodings, if we can.
UTF-16 uses 2 or 4 bytes to represent each character (4 only for code points above U+FFFF), and the byte order is signalled by the Byte Order Mark or fixed by the BE/LE variant, with prze encoded as
UTF-16BE - CP 1200 - 00 70 00 72 00 7A 00 65
UTF-16LE - CP 1202 - 70 00 72 00 7A 00 65 00
Given that the BLOB contains no null bytes (00) at all, it is reasonable to discount UTF-16.
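The null-byte argument is easy to check mechanically. The Python sketch below encodes prze in both UTF-16 byte orders and counts the null bytes in the BLOB; it uses Python's own `utf-16-be` and `utf-16-le` codecs, which I am assuming behave like the corresponding IBM code pages for these four ASCII letters.

```python
blob = bytes.fromhex("70727A65C582C485637A6E696361")

# "prze" in both UTF-16 byte orders, without a BOM.
be = "prze".encode("utf-16-be")
le = "prze".encode("utf-16-le")

print(be.hex(" ").upper())
print(le.hex(" ").upper())

# UTF-16 text that is mostly ASCII has a null in every other byte.
null_count = blob.count(0)
print(null_count)
```

Every ASCII letter costs one null byte in UTF-16, so a BLOB of this kind with no nulls at all cannot be UTF-16.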
ISO-8859-2 (CP 912) is a single-byte character set in which C5 and C4 map to Ĺ and Ä rather than the two desired characters (ł would be B3 and ą would be B1), and thus we can eliminate it.
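For completeness, here is a small Python sketch of the ISO-8859-2 check. It uses Python's `iso8859_2` codec as a stand-in for CCSID 912, which is an assumption on my part; the two should agree for the bytes in question.

```python
# What ISO-8859-2 would need to contain for the two desired characters.
wanted = "łą".encode("iso8859_2")

# What the bytes actually present in the BLOB mean in ISO-8859-2.
c5 = bytes([0xC5]).decode("iso8859_2")
c4 = bytes([0xC4]).decode("iso8859_2")

print(wanted.hex(" ").upper())
print(c5, c4)
```

In ISO-8859-2 the word would need B3 B1 in the middle and would be 12 bytes long, whereas the BLOB has the four bytes C5 82 C4 85 there and is 14 bytes long.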