Using Elephantbird JsonLoader I'm able to load the data if the record is in this format:
{"disknum":36,"disksum":136.401,"disk_rate":1872.0,"disk_lnum": 13}
But the actual data is in the format below, with each record enclosed in square brackets:
[{"disknum":36,"disksum":136.401,"disk_rate":1872.0,"disk_lnum": 13}]
When I try to parse this, it neither throws an error nor gives any useful output. It reports "Success!" with 0 records read and 0 records written.
Please advise how to handle data enclosed in square brackets.
Below is my script for records without square brackets:
register '/home/data/Desktop/elephantbird/elephant-bird-core-4.1.jar';
register '/home/gopal/Desktop/elephantbird/elephant-bird-hadoop-compat-4.1.jar';
register '/home/gopal/Desktop/elephantbird/elephant-bird-pig-4.1.jar';
register '/home/gopal/Desktop/elephantbird/json-simple-1.1.jar';
a = LOAD '/pig/tc1.log' USING com.twitter.elephantbird.pig.load.JsonLoader() as (json:map[]);
b = FOREACH a GENERATE flatten(json#'node_disk_lnum_1') AS node_disk_lnum_1, flatten(json#'node_disk_xfers_in_rate_sum') AS node_disk_xfers_in_rate_sum, flatten(json#'node_disk_bytes_in_rate_22') AS node_disk_bytes_in_rate_22, flatten(json#'node_disk_lnum_7') AS node_disk_lnum_7;
dump b;
Please advise! Thanks in advance :)
I think this might help. See the solution in "Json parse with elephantbird in Pig"; it's pretty close.
You need to provide a root name for your JSON.
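If changing the input files is not an option, another workaround is to strip the brackets inside Pig before parsing. The following is an untested sketch, not a confirmed fix. It assumes every line holds exactly one object wrapped in [ ], that the same jars are already registered as in the question, and that JsonStringToMap from the elephant-bird-pig jar is available in your version. The relation names raw, unwrapped and parsed are made up for illustration, and the keys are taken from the sample record in the question.

```pig
-- Read each line as plain text so the brackets can be removed first.
raw = LOAD '/pig/tc1.log' USING TextLoader() AS (line:chararray);

-- Drop a leading '[' and a trailing ']' (REPLACE takes a Java regex).
unwrapped = FOREACH raw GENERATE REPLACE(line, '^\\[|\\]$', '') AS line;

-- Parse the remaining JSON object into a map (assumed UDF from elephant-bird-pig).
parsed = FOREACH unwrapped GENERATE
    com.twitter.elephantbird.pig.piggybank.JsonStringToMap(line) AS json;

-- Pull out the fields; cast them if you need numbers, since this UDF is
-- expected to return string values.
b = FOREACH parsed GENERATE
    json#'disknum'   AS disknum,
    json#'disksum'   AS disksum,
    json#'disk_rate' AS disk_rate,
    json#'disk_lnum' AS disk_lnum;

DUMP b;
```

If a line can contain several objects inside one pair of brackets, this will not work as written, and wrapping the array under a root name is probably the better route.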