hadoop, clojure, apache-pig, cascading, cascalog

Clojure Hadoop - 5 Lines of Cascalog equivalent to 300 lines of PIG?


In this presentation (slides 36 and 37), the author of Cascalog asserts that, given a data set of names and ages like [name age], the query to return all the people whose age is greater than the average age takes 300 lines of PIG.

Is this a valid assertion? How many lines of PIG is it really?

Or is the problem he's describing bigger than what I've described?

(Disclaimer - I'm a big fan of Nathan's work, of Clojure and Cascalog - I'm just trying to get some facts straight).


Solution

  • You've misinterpreted what he says in this presentation. What he means is that the implementation of "average" in PIG is 300 lines of Java code, versus the 5 lines of Cascalog needed when it is implemented with predicate macro functionality. He wants to emphasize the power of composition.

    P.S. Sorry for my bad English, I'm still learning ;-)
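
To illustrate the composition point, here is a minimal sketch in plain Python (not Cascalog or PIG). It assumes "average" can be built from two smaller aggregators, a count and a sum, and then reused in the query from the question. The sample data and all function names are hypothetical, and the input is assumed to be non-empty.

```python
from functools import reduce

# Hypothetical sample data in the [name age] shape from the question.
people = [("alice", 28), ("bob", 33), ("carol", 41), ("dave", 22)]


def count(values):
    """Aggregator: number of values."""
    return reduce(lambda acc, _: acc + 1, values, 0)


def total(values):
    """Aggregator: sum of values."""
    return reduce(lambda acc, v: acc + v, values, 0)


def average(values):
    """Average defined purely by composing the two aggregators above."""
    return total(values) / count(values)


def older_than_average(records):
    """Return the records whose age is strictly greater than the average age."""
    ages = [age for _, age in records]
    avg_age = average(ages)
    return [(name, age) for name, age in records if age > avg_age]


result = older_than_average(people)
```

Here `average` needs no dedicated implementation of its own: it is a few lines that reuse `count` and `total`, which is the kind of reuse the answer attributes to predicate macros.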