This, this, and this did not solve my problem. They all involve writing custom UDFs. I want to use a built-in UDF, any built-in UDF, and I get the same or a similar error for every one I have tried.
FOO = LOAD 'filepath/data.csv'
USING PigStorage(',')
AS (name:chararray, age:int, kilograms:double);
BAR = FOREACH FOO GENERATE $0, $1, $2, kilograms*2.2 AS pounds;
This works as expected, basically creating the same relation as FOO but with an extra column that has KG converted to LBS.
But if I want to use a built-in function, for example to take the log of kilograms, like this:
BAR2 = FOREACH FOO GENERATE $0, $1, $2, log(kilograms) AS logscaleKG;
I get the following error (or similar):
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve log using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
No UDF seems to work inside a FOREACH GENERATE.
Pig is a bit finicky about capitalization: built-in function names are case-sensitive, so you need to write LOG rather than log. For example, I can run this code just fine on a fresh Hortonworks Sandbox.
$ hdfs dfs -cat /tmp/kg.csv
one,1
two,2
three,3
grunt> a = LOAD '/tmp/kg.csv' USING PigStorage(',') AS (txt:chararray, val:int);
grunt> b = FOREACH a GENERATE txt, val, LOG(val);
grunt> DUMP b;
... # Running some MapReduces
(one,1,0.0)
(two,2,0.6931471805599453)
(three,3,1.0986122886681098)
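Applied to the script from the question, the fix would look roughly like the sketch below. It assumes the same input path and column layout as the question, and it declares the name column as chararray, which is Pig's string type. I have not run it against that data.

```pig
-- Assumption: same file path and three-column layout as in the question.
FOO = LOAD 'filepath/data.csv'
    USING PigStorage(',')
    AS (name:chararray, age:int, kilograms:double);

-- Built-in function names are case-sensitive: LOG resolves, log does not.
-- LOG is the natural logarithm; LOG10 is the base-10 variant.
BAR2 = FOREACH FOO GENERATE name, age, kilograms, LOG(kilograms) AS logscaleKG;

DUMP BAR2;
```

Keywords such as FOREACH and GENERATE can be written in either case, which is probably why the case-sensitivity of function names is easy to miss.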