I have two relations (data objects) in Pig.
data_1:
    col_a: chararray,
    col_b: int,
    col_c: int,
    col_d: chararray
data_2:
    col_a: chararray,
    col_b: chararray,
    col_c: int,
    col_d: int,
    col_e: int
I want to join the two of them. I tried both a left join and an inner join:
all_data = JOIN data_1 BY (col_a) LEFT, data_2 by (col_b);
all_data = JOIN data_1 BY (col_a), data_2 by (col_b);
When I tried to dump the result (after limiting it to 10 records), both options gave the same error:
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: all_data_limit: Limit - scope-6383 Operator Key: scope-6383): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: all_data: New For Each(true,true)[tuple] - scope-6382 Operator Key: scope-6382): org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.ClassCastException: org.apache.pig.impl.io.NullableText cannot be cast to org.apache.pig.impl.io.NullableBytesWritable
I'm getting a bit frustrated: I haven't been able to find a solution, and I've been searching for one for 3 days now. Any help would be great. Thanks!
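Try wrapping both join keys in TRIM, which returns a chararray, so both sides of the join present the key as the same type: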
all_data = JOIN data_1 BY TRIM(col_a) LEFT, data_2 by TRIM(col_b);
all_data = JOIN data_1 BY TRIM(col_a), data_2 by TRIM(col_b);
Let me know if this works without an error.
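
If the TRIM version works, the likely cause is a key type mismatch at runtime: the exception says a Text (chararray) key met a key that was written as a bytes value (bytearray). That is an assumption on my part, since the question does not show the LOAD statements. Below is a minimal sketch for checking the schemas and casting the keys explicitly. The relation names data_1_keyed and data_2_keyed and the field name join_key are made up for illustration.

```pig
-- Check what Pig believes the schemas are before joining.
DESCRIBE data_1;
DESCRIBE data_2;

-- Assumption: one of the keys is really a bytearray at runtime.
-- Cast both keys to chararray and keep all original fields.
data_1_keyed = FOREACH data_1 GENERATE (chararray)col_a AS join_key, *;
data_2_keyed = FOREACH data_2 GENERATE (chararray)col_b AS join_key, *;

-- Join on the explicitly typed key, then inspect a small sample.
all_data = JOIN data_1_keyed BY join_key LEFT OUTER, data_2_keyed BY join_key;
all_data_limit = LIMIT all_data 10;
DUMP all_data_limit;
```

If DESCRIBE already reports chararray for both keys, the cast may change nothing, and the TRIM form above is probably the safer choice.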