regex split apache-camel tokenize spring-camel

Apache Camel Split by start and end characters SOH and ETX

I have a Spring Boot application that loads a routes.xml on startup.

In routes.xml, I have an MQ queue source that contains a sample message like this:

SOH{123}{345}{4
5
6
}ETXSOH{111}{222}{3
3
3
}ETX

where SOH = \u0001 and ETX = \u0003

When I receive this message, I want to split it into two messages:

{123}{345}{4
5
6
}

and

{111}{222}{3
3
3
}

Currently I am trying to split using:

<split>
  <tokenize token="(?s)(?&lt;=\u0001)(.*?)(?=\u0003)" regex="true"/>
  <to uri="jms:queue:TEST.OUT.Q" />
</split>

I tested this regex with an online regex tester and it matched: https://regex101.com/r/fU5VVj/1

But when running the code, what I get is:

SOH

ETXSOH

ETX

I also tried token and endToken, but that did not work for my case:

<tokenize token="\u0001" endToken="\u0003" />

Is my case possible using a Camel route XML? If yes, can you point me to the correct regex or the correct start and end tokens?

Thanks

Solution

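It seems Camel's tokenizer does not apply the regex the way a plain Java regex match does: judging by the output above, it treats the text that matches the pattern as the delimiter rather than as the token. I ended up creating a new processing step that does the split in Java, using the sample code below.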

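    // Needs java.util.regex.Pattern, java.util.regex.Matcher,
    // java.util.List and java.util.LinkedList.
    // "items" is the raw message body as a String.
    // (?s) lets '.' match newlines; the lookbehind/lookahead keep
    // SOH (U+0001) and ETX (U+0003) out of each matched token.
    Pattern p = Pattern.compile("(?s)(?<=\\u0001).*?(?=\\u0003)");
    Matcher m = p.matcher(items);
    List<String> tokens = new LinkedList<>();

    // Each find() returns the next block between an SOH and the following ETX.
    while (m.find()) {
        String token = m.group();
        System.out.println("item = " + token);
        tokens.add(token);
    }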
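
For anyone who wants to reproduce this outside Camel, here is a minimal, self-contained sketch that runs the same pattern in two ways on the sample message from the question. It assumes (I have not confirmed this against the Camel source for every version) that the regex tokenizer hands the pattern to a java.util.Scanner as its delimiter, which is what the SOH / ETXSOH / ETX output suggests. The class name SohEtxSplitDemo and the method names splitAsDelimiter and findTokens are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SohEtxSplitDemo {

    // (?s) lets '.' match newlines; the lookarounds keep SOH (U+0001)
    // and ETX (U+0003) out of the matched text.
    static final Pattern BETWEEN_SOH_AND_ETX =
            Pattern.compile("(?s)(?<=\\u0001).*?(?=\\u0003)");

    // Assumption: this mimics what the tokenizer seems to do. The pattern is
    // used as a DELIMITER, so we get back the text between the matches.
    public static List<String> splitAsDelimiter(String message) {
        List<String> pieces = new ArrayList<>();
        try (Scanner scanner = new Scanner(message)) {
            scanner.useDelimiter(BETWEEN_SOH_AND_ETX);
            while (scanner.hasNext()) {
                pieces.add(scanner.next());
            }
        }
        return pieces;
    }

    // The approach from the solution: the pattern is used to FIND tokens,
    // so we get back the matches themselves.
    public static List<String> findTokens(String message) {
        List<String> tokens = new ArrayList<>();
        Matcher m = BETWEEN_SOH_AND_ETX.matcher(message);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String message = "\u0001{123}{345}{4\n5\n6\n}\u0003"
                       + "\u0001{111}{222}{3\n3\n3\n}\u0003";

        // Make the control characters visible before printing.
        for (String piece : splitAsDelimiter(message)) {
            System.out.println("delimiter mode: "
                    + piece.replace("\u0001", "SOH").replace("\u0003", "ETX"));
        }
        for (String token : findTokens(message)) {
            System.out.println("find mode: " + token.replace("\n", "\\n"));
        }
    }
}
```

In delimiter mode the pieces that come back are the control characters left over between the matches, which matches what I saw from the route. In find mode the two brace blocks come back, which is what I wanted. So the regex itself looks fine; the difference seems to be in how it is applied.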