If I write a custom load function with the constructor
MyLoadFunction(String someOptions, DataBag myBag)
How can I call this function from Pig Latin?
X = load 'foo.txt' using MyLoadFunction('myString', myBagAlias);
This does not work. Is it even possible?
Thanks.
I'm not sure Pig is a good fit for what you need. Pig is all about loading up a lot of data and then putting that data through a pipeline. It sounds like you want something more procedural: load a small amount of data, do some processing, make a decision based on the result, and follow that algorithm to completion.
So I'm not sure this is the best way for you to go, but you can try writing a UDF that accesses HBase and grabs the data you need. LOAD is inappropriate here because LOAD does not return a bag; it returns a relation that Pig expects you to put through some transformations. You can, however, pass a bag as input to a UDF and then, inside that UDF, do the HBase lookup and processing you want.
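As a rough sketch of the UDF route: the script below collects a relation into a single bag with GROUP ... ALL and hands that bag to an eval function. The jar name, the class com.example.HBaseLookup, the input file keys.txt, and the field name id are all hypothetical placeholders, and the UDF itself (a Java class extending EvalFunc that does the HBase access) still has to be written. Note that the constructor argument in DEFINE is a string literal; as far as I know that is the only kind of argument Pig lets you pass to a function's constructor, which is why the bag goes in as a regular call argument instead.

```pig
-- Hypothetical names throughout: myudfs.jar, com.example.HBaseLookup, keys.txt
REGISTER myudfs.jar;

-- Constructor arguments are string literals; the bag is NOT passed here
DEFINE HBaseLookup com.example.HBaseLookup('myString');

keys    = LOAD 'keys.txt' AS (id:chararray);

-- GROUP ... ALL produces one row whose second field is a bag named "keys"
grouped = GROUP keys ALL;

-- Pass the bag as an ordinary argument to the UDF
X       = FOREACH grouped GENERATE HBaseLookup(keys);
```

Keep in mind that GROUP ... ALL funnels everything to a single reducer, so this only makes sense when the bag is small.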
A more Pig-ish way of doing things would be to load all of the relevant HBase data into one or more relations, and then do a JOIN as appropriate to combine the pieces of data you want.
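A minimal sketch of that approach, assuming an HBase table called mytable with a column family cf and columns col1 and col2 (all made-up names), and assuming foo.txt holds one row key per line:

```pig
-- Row keys to look up (file layout is an assumption)
keys = LOAD 'foo.txt' AS (id:chararray);

-- Load the HBase table as a relation; '-loadKey true' puts the row key in the first field
hb   = LOAD 'hbase://mytable'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:col1 cf:col2', '-loadKey true')
       AS (id:chararray, col1:chararray, col2:chararray);

-- Combine the two relations on the key instead of looking rows up one at a time
X    = JOIN keys BY id, hb BY id;
```

This scans the table rather than doing point lookups, which is the trade-off you accept for staying inside Pig's pipeline model.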