Please bear with me before I get hammered: there are other Stack Overflow questions like mine, but their answers do not work for me. I am trying to remove unwanted characters from an inbound message, without success. I do not know what the characters are or what they represent, but they seem to break the data up like carriage returns, line feeds, or newlines. I need to keep all the spaces except those at the end. The characters I see are ^M and ^C, sometimes together and sometimes alone.
My test code, basically assembled from other similar questions:
String msg = exchange.getIn().getBody(String.class);
log.info("Message before apply filter: " + msg);
filteredMessage = msg.replaceAll("[^\\x00-\\x7F]","");
log.info("Remove non-ASCII characters: " + filteredMessage);
filteredMessage = msg.replaceAll("[\\p{C}]","");
log.info("Remove all Control characters: " + filteredMessage);
filteredMessage = msg.replaceAll("[\\p{Cntrl}\\p{Cc}\\p{Cf}\\p{Co}\\p{Cn}]","");
log.info("Remove some Control characters: " + filteredMessage);
filteredMessage = msg.replaceAll("[^\\p{Print}]","");
log.info("Remove non printable characters: " + filteredMessage);
filteredMessage = msg.trim();
log.info("Trim: " + filteredMessage);
filteredMessage = msg.replaceAll("\\cM","");
log.info("Remove ^M Control characters: " + filteredMessage);
filteredMessage = msg.replaceAll("^M","");
log.info("Remove ^M Control characters: " + filteredMessage);
exchange.getIn().setBody(filteredMessage);
Sample data files:
A 291511191831421742XXXXXXXXXXWRN/WN18111917420077000009ENG 2 IGN B FAULT^M ^C
A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
A 080011191830061749XXXXXXXXXXMPF/AN.N306DN/FIDAL800 /DM181119142800/DAKSMF/DSKMSP/WN18111916310034000006NAV ATC/XPDR 1 FAULT^M,18111917480032000009BRAKES HOT^M/FR18111916310034523306ATC 1(1SH1)^M/IDATC 1^M/FR18111916310034723406ATC1(1SH1)/TCAS(1000SG)^M/IDTCAS^M/FR18111917120022833406AFS:FMGC2^M/IDAFS 1^M,IR 1^M,IR 2^M,IR 3^M/FR18111917120022833406AFS:FMGC1^M/IDAFS 1^M,IR 1^M,IR 2^M,IR 3^M ^C
My filters are not working. Here are the results. It's as if the regex is not being applied at all, or I'm doing something silly. Thanks, everyone!
Message before apply filter: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Remove non-ASCII characters: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Remove all Control characters: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Remove some Control characters: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Remove non printable characters: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Trim: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Remove ^M Control characters: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
Remove ^M Control characters: A 056611191832641742XXXXXXXXXXFLR/FR18111917410032470002BRK TEMP SENSOR4(6GW)/ BTMU(2GW)^M/IDBSCU 1^M ^C
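Try the following and see how it goes. Each replaceAll is applied to filteredMessage rather than msg, so the filters build on one another instead of each starting over from the original message: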
String msg = exchange.getIn().getBody(String.class);
log.info("Message before apply filter: " + msg);
filteredMessage = msg.replaceAll("[^\\x00-\\x7F]","");
log.info("Remove non-ASCII characters: " + filteredMessage);
filteredMessage = filteredMessage.replaceAll("[\\p{C}]","");
log.info("Remove all Control characters: " + filteredMessage);
filteredMessage = filteredMessage.replaceAll("[\\p{Cntrl}\\p{Cc}\\p{Cf}\\p{Co}\\p{Cn}]","");
log.info("Remove some Control characters: " + filteredMessage);
filteredMessage = filteredMessage.replaceAll("[^\\p{Print}]","");
log.info("Remove non printable characters: " + filteredMessage);
filteredMessage = filteredMessage.trim();
log.info("Trim: " + filteredMessage);
filteredMessage = filteredMessage.replaceAll("\\cM","");
log.info("Remove ^M Control characters: " + filteredMessage);
filteredMessage = filteredMessage.replaceAll("^M","");
log.info("Remove ^M Control characters: " + filteredMessage);
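After this, if you still see characters you don't want, they may be literal text rather than control characters. In a regex, an unescaped ^ is the start-of-input anchor, so "^M" only matches an M at the very start of the message. You need to use the escape character \
to escape backslashes and other metacharacters:
<>()[]{}\^-=$!|?*+.
In a Java string literal the backslash itself must be doubled, so you need, for example: .replaceAll("\\^M","");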
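Since it is not clear from the logs whether ^M and ^C are real control characters (carriage return 0x0D and end-of-text 0x03, which many viewers display in caret notation) or the literal two-character text, here is a minimal sketch that handles both cases and then strips trailing whitespace. The class name ControlCharFilter and the methods clean and describe are made up for illustration and are not part of Camel or the original code; describe is only a debugging aid for finding out what is actually in the message.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code or of any library.
final class ControlCharFilter {
    // Literal two-character text "^M" or "^C" (the caret must be escaped).
    private static final Pattern CARET_TEXT = Pattern.compile("\\^[MC]");
    // Real control characters: CR (0x0D) and ETX (0x03).
    private static final Pattern CONTROL_CHARS = Pattern.compile("[\\r\\x03]");
    // Whitespace at the very end of the message only.
    private static final Pattern TRAILING_WS = Pattern.compile("\\s+\\z");

    // Removes both forms of the markers, keeps interior spaces, drops trailing ones.
    static String clean(String msg) {
        String out = CARET_TEXT.matcher(msg).replaceAll("");
        out = CONTROL_CHARS.matcher(out).replaceAll("");
        return TRAILING_WS.matcher(out).replaceAll("");
    }

    // Debugging aid: renders control characters as <0xNN> so they show up in a log.
    static String describe(String msg) {
        StringBuilder sb = new StringBuilder();
        for (char c : msg.toCharArray()) {
            if (c < 0x20 || c == 0x7F) {
                sb.append(String.format("<0x%02X>", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "ENG 2 IGN B FAULT^M ^C";      // caret notation as plain text
        String real = "ENG 2 IGN B FAULT\r \u0003";     // actual control characters
        System.out.println(describe(real));              // ENG 2 IGN B FAULT<0x0D> <0x03>
        System.out.println("[" + clean(literal) + "]");  // [ENG 2 IGN B FAULT]
        System.out.println("[" + clean(real) + "]");     // [ENG 2 IGN B FAULT]
    }
}
```

In the route, this would be called as exchange.getIn().setBody(ControlCharFilter.clean(msg)). If describe shows plain ^ and M characters, the data contains literal text and only the escaped pattern will match; if it shows <0x0D> and <0x03>, the control-character pattern is the one doing the work.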