I am trying to write UIDs into an Accumulo Value (from a Key/Value pair) as a protobuf Uid.List and then read them back out. Specifically, I am using: org.apache.accumulo.examples.wikisearch.protobuf.Uid; import org.apache.accumulo.examples.wikisearch.protobuf.Uid.List.Builder
I use the following code to write the Uid.List (UidListCount is the number of UIDs in the list seq):
Builder uidBuilder = Uid.List.newBuilder();
uidBuilder.setIGNORE(false);
for (String entry : seq) {
uidBuilder.addUID(entry);
}
Uid.List uidList = uidBuilder.build();
Value newAccumuloValue = new Value(uidList.toByteArray());
This seems to work fine.
When I try to read the Uid.List back out of the Accumulo Value (where value holds a serialized protobuf Uid.List), it's a no-go:
byte[] byteVal = value.get(); // retrieving the bytes of the Accumulo Value containing the Uid.List
Uid.List uids= Uid.List.parseFrom(byteVal);
while (counter <= counter){
String uidStr = uids.getUID(counter).toString();
System.out.println(uidStr);
}
I keep getting "tag errors".
I would really like to understand how to read out what goes in.
Thanks!
I would suggest changing the second bit of code to something along the lines of:
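byte[] byteVal = value.get(); // raw bytes stored in the Accumulo Value
Uid.List uids = Uid.List.parseFrom(byteVal); // throws InvalidProtocolBufferException on malformed input
int count = uids.getUIDCount(); // number of UID entries actually present in the list
for (int i = 0; i < count; i++) {
String uidStr = uids.getUID(i); // getUID already returns a String
System.out.println(uidStr);
}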
This code works as long as the UIDs in your list are properly cleaned before the list is built by protobuf. If the data contains characters (such as Unicode nulls) that protobuf uses as part of the list format, parsing will break, because data characters will be read as format characters that do not match the schema. I would start by checking your data and making sure it meets the quality and cleanliness standards for what you are trying to achieve.
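As a rough illustration of the cleaning step mentioned above, here is a minimal sketch that strips control characters (including Unicode nulls) and surrounding whitespace from each UID before it would be passed to addUID. The class UidCleaner and its methods are hypothetical names, not part of Accumulo or protobuf, and what counts as "clean" here is only an assumption; adjust it to whatever your data actually requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Accumulo or protobuf.
final class UidCleaner {
    private UidCleaner() {}

    // Removes ISO control characters (U+0000 included) and trims whitespace.
    // A null input is treated as an empty UID.
    static String clean(String uid) {
        if (uid == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(uid.length());
        for (int i = 0; i < uid.length(); i++) {
            char c = uid.charAt(i);
            if (!Character.isISOControl(c)) {
                sb.append(c);
            }
        }
        return sb.toString().trim();
    }

    // Cleans every UID and drops the ones that end up empty.
    static List<String> cleanAll(List<String> uids) {
        List<String> out = new ArrayList<>();
        for (String uid : uids) {
            String cleaned = clean(uid);
            if (!cleaned.isEmpty()) {
                out.add(cleaned);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample input: one embedded null, one padded entry, one empty, one null.
        List<String> seq = Arrays.asList("doc-1", "doc\u0000-2", "  doc-3\n", "", null);
        System.out.println(cleanAll(seq));
    }
}
```

With something like this in place, the write loop would iterate over UidCleaner.cleanAll(seq) instead of seq, so only the cleaned strings reach uidBuilder.addUID.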