
ERROR 1070 Apache Pig, using built-in UDF


This, this, and this did not solve my problem: they are all about writing custom UDFs. I want to use a built-in UDF, any built-in UDF, and I get the same or a similar error for every one I have tried.

 FOO = LOAD 'filepath/data.csv' 
 USING PigStorage(',') 
 AS (name:chararray, age:int, kilograms:double);

 BAR = FOREACH FOO GENERATE $0, $1, $2, kilograms*2.2 AS pounds;

This works as expected: it creates the same relation as FOO, plus an extra column with kilograms converted to pounds.

But if I try to take the logarithm of kilograms, like this:

 BAR2 = FOREACH FOO GENERATE $0, $1, $2, log(kilograms) AS logscaleKG;

I get the following error (or similar):

 ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve log using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]

No UDF seems to work inside a FOREACH GENERATE.


Solution

  • Pig function names are case-sensitive, so you need to write LOG in uppercase. For example, I can run this code just fine on a fresh Hortonworks Sandbox:

    $ hdfs dfs -cat /tmp/kg.csv
    one,1
    two,2
    three,3
    

    grunt> a = LOAD '/tmp/kg.csv' USING PigStorage(',') AS (txt:chararray, val:int);
    grunt> b = FOREACH a GENERATE txt, val, LOG(val);
    grunt> DUMP b;
    ... # Running some MapReduces
    (one,1,0.0)
    (two,2,0.6931471805599453)
    (three,3,1.0986122886681098)
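
Applied to the script from the question, the fix might look like the sketch below. It assumes the same file path and column layout as the question and declares name as chararray, which is Pig's string type. LOG10 is included only to show the base-10 counterpart of LOG, and the aliases lnKG and log10KG are made up for illustration.

```pig
-- Load the data as in the question (path and schema are taken from there).
FOO = LOAD 'filepath/data.csv'
      USING PigStorage(',')
      AS (name:chararray, age:int, kilograms:double);

-- Built-in function names are case-sensitive: LOG resolves, log does not.
BAR2 = FOREACH FOO GENERATE name, age, kilograms,
       LOG(kilograms)   AS lnKG,     -- natural logarithm
       LOG10(kilograms) AS log10KG;  -- base-10 logarithm

DUMP BAR2;
```

The values in the DUMP output above (0.693... for 2, 1.098... for 3) are natural logarithms, so LOG10 is the one to use if a base-10 scale is wanted.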