I have written Java code to decode a URL-encoded string (charset "UTF-8"). The string was encoded three times. I run this code in an ETL step, so I could chain the same step three times in a row, but that would be inefficient. I searched online but didn't find anything promising. Is there a way in Java to decode a string that has been encoded multiple times?
Here's my input string "uri":
file:///C:/Users/nikhil.karkare/dev/pentaho/data/ba-repo-content-original/public/Development+Activity/Defects+Unresolved+%252528by+Non-Developer%252529.xanalyzer
Here's my code which is decoding this string:
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.io.*;
String decodedValue;
public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException {
// First, get a row from the default input hop
//
Object[] r = getRow();
// If the row object is null, we are done processing.
//
if (r == null) {
setOutputDone();
return false;
}
// It is always safest to call createOutputRow() to ensure that your output row's Object[] is large
// enough to handle any new fields you are creating in this step.
//
Object[] outputRow = createOutputRow(r, data.outputRowMeta.size());
String newFileName = get(Fields.In, "uri").getString(r);
try{
decodedValue = URLDecoder.decode(newFileName, "UTF-8");
}
catch (UnsupportedEncodingException e) {
throw new AssertionError("UTF-8 is unknown");
}
// Set the value in the output field
//
get(Fields.Out, "decodedValue").setValue(outputRow, decodedValue);
// putRow will send the row on to the default output hop.
//
putRow(data.outputRowMeta, outputRow);
return true;
}
The output of this code is the following:
file:///C:/Users/nikhil.karkare/dev/pentaho/data/ba-repo-content-original/public/Development Activity/Defects Unresolved %2528by Non-Developer%2529.xanalyzer
When I run this code in the ETL three times, I get the output I want, which is this:
file:///C:/Users/nikhil.karkare/dev/pentaho/data/ba-repo-content-original/public/Development Activity/Defects Unresolved (by Non-Developer).xanalyzer
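URL encoding replaces %, ( and ) with %25, %28 and %29, respectively. Each additional round of encoding re-encodes the % itself, so ( becomes %28, then %2528, then %252528. To undo that, you have to decode once per round of encoding: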
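To see where %252528 comes from, here is a small sketch that applies URLEncoder.encode repeatedly. The class and method names (LayeredEncodingDemo, encodeTimes) are hypothetical and exist only for this illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class LayeredEncodingDemo {

    // Hypothetical helper: URL-encodes the value the given number of times.
    public static String encodeTimes(String value, int rounds) {
        String result = value;
        try {
            for (int i = 0; i < rounds; i++) {
                // Each round turns every '%' from the previous round into "%25".
                result = URLEncoder.encode(result, "UTF-8");
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported on every Java platform.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
        return result;
    }
}
```

With this helper, encodeTimes("(", 1) gives %28, encodeTimes("(", 2) gives %2528, and encodeTimes("(", 3) gives %252528, which matches the input string in the question.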
String s = "file:///C:/Users/nikhil.karkare/dev/pentaho/data/"
+ "ba-repo-content-original/public/Development+Activity/"
+ "Defects+Unresolved+%252528by+Non-Developer%252529.xanalyzer";
// %252528 ... %252529
s = URLDecoder.decode(s, "UTF-8");
// %2528 ... %2529
s = URLDecoder.decode(s, "UTF-8");
// %28 ... %29
s = URLDecoder.decode(s, "UTF-8");
// ( ... )
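The three calls can also be folded into a loop, so that a single ETL step does all the decoding. The following is a minimal sketch, not a definitive implementation; RepeatedUrlDecoder, decodeTimes and decodeUntilStable are names made up for this example. decodeTimes assumes you know how many rounds of encoding were applied. decodeUntilStable guesses by decoding until the string stops changing, which is only safe if the real file name contains no literal + or %xx sequences, because those would be decoded one time too many.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class RepeatedUrlDecoder {

    // Single decoding pass, wrapping the checked exception.
    private static String decodeOnce(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported on every Java platform.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    // Decodes exactly 'passes' times; use when the number of rounds is known.
    public static String decodeTimes(String value, int passes) {
        String result = value;
        for (int i = 0; i < passes; i++) {
            result = decodeOnce(result);
        }
        return result;
    }

    // Decodes until the value no longer changes, at most 'maxPasses' times.
    // Assumption: the fully decoded value contains no literal '+' or "%xx".
    public static String decodeUntilStable(String value, int maxPasses) {
        String current = value;
        for (int i = 0; i < maxPasses; i++) {
            String next;
            try {
                next = decodeOnce(current);
            } catch (IllegalArgumentException e) {
                // A '%' that is not a valid escape: treat the current value as final.
                return current;
            }
            if (next.equals(current)) {
                return current;
            }
            current = next;
        }
        return current;
    }
}
```

In the step from the question, the single decode call could then become decodedValue = RepeatedUrlDecoder.decodeTimes(newFileName, 3); assuming the helper class is made available to the step.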