I am trying to build a query layer on 'azkaban' Database. (Language used: Java) I am running into, what I thought would be a simple problem (but turning out to be irritating).
This is the query I am running:
select exec_id, CONVERT(log USING latin1)from execution_logs
"log" is a Column of type 'longblob'
This is my Java code for reading "log"s:
try {
Connection conn = AzkabanClient.getPhoenixConnection(conf);
String s = " select exec_id, log from execution_logs ";
PreparedStatement pstmt = conn.prepareStatement(s);
ResultSet rs = pstmt.executeQuery();
String logString="";
while(rs.next()){
int i = rs.getInt("exec_id");
InputStream inputStream = rs.getBinaryStream("log");
java.io.BufferedReader in = new BufferedReader(new java.io.InputStreamReader(inputStream));
String str;
while ((str = in.readLine()) != null) {
logString += str;
}
inputStream.close();
}
conn.close();
}catch(Exception e){
LOGGER.error("Error =>" + e);
}
The problem here is: By the end of while loop, I am able to read the 'log' for a row in the table, but the String is unreadable (encoded?)
eg:
logString = "‹Å\]Ç•}^ÿ>°]ÕÕÝUzY‰”Uà8Žììbg¦¥..."
I have tried to modify the query like this:
"select exec_id, CONVERT(log using latin1) from execution_logs"
But still the same issue.
I tried "utf8", but when I do that, I get NULL in column "log" of the ResultSet.
Please help if somebody has experience this or know how to go about solving this?
In the meanwhile, I will keep trying things.
Thanks
Still trying: I am using xampp now (just to do quick prototyping).
In phpmyadmin UI, when I click on the blob, it downloads a '.bin' file. On mac, I can open this file and see proper "English" words (or logs in english), as expected.
But how to do this programatically?
So after a lot of digging into azkaban, I found that this is how one should query for LongBlobs in azkaban-Database:
public String getErrorLog(){
String returnString = "";
try {
Connection conn = AzkabanClient.getPhoenixConnection(conf);
String s = " select exec_id, enc_type, log from execution_logs where exec_id = 3964 and name = 'http-time-series-hourly' ";
PreparedStatement pstmt = conn.prepareStatement(s);
ResultSet rs = pstmt.executeQuery();
while (rs.next()) {
int i = rs.getInt("exec_id");
ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
EncodingType encType = EncodingType.fromInteger(rs.getInt("enc_type"));
int debug = 0;
byte[] data = rs.getBytes("log");
try {
byte[] buffer = data;
ByteArrayOutputStream byteArrayOutputStream = null;
if (encType == EncodingType.GZIP) {
byteArrayOutputStream = GZIPUtils.unGzipBytesOutputStream(data);
}
returnString = new String(byteArrayOutputStream.toByteArray(), "UTF-8");
} catch (IOException e) {
throw new SQLException(e);
}
}
conn.close();
} catch (Exception e) {
LOGGER.error("Error =>" + e);
}
return returnString;
}
where :
GZIPUtils is:
public class GZIPUtils {
public static ByteArrayOutputStream unGzipBytesOutputStream(byte[] bytes) throws IOException {
ByteArrayInputStream byteInputStream = new ByteArrayInputStream(bytes);
GZIPInputStream gzipInputStream = new GZIPInputStream(byteInputStream);
ByteArrayOutputStream byteOutputStream = new ByteArrayOutputStream();
IOUtils.copy(gzipInputStream, byteOutputStream);
return byteOutputStream;
}
}
and EncodingType:
public static enum EncodingType {
PLAIN(1), GZIP(2);
private int numVal;
EncodingType(int numVal) {
this.numVal = numVal;
}
public int getNumVal() {
return numVal;
}
public static EncodingType fromInteger(int x) {
switch (x) {
case 1:
return PLAIN;
case 2:
return GZIP;
default:
return PLAIN;
}
}
}