Where can I learn to use Pig without having to set up Hadoop?


My goal is to learn Pig in order to enhance my resume for machine learning/statistical analysis jobs. I am not really interested in all of the nitty-gritty Hadoop details at the moment (although I would love to learn them later; Hadoop has just been very difficult to set up on my machine even with instructions, and I'm more a stats guy than a programmer). Is there some resource where I could learn Pig, and have easy access to it for experimentation, without having to learn Hadoop from the ground up?


Solution

  • Yes. Install Pig and then run it in local mode, which works against the local file system and does not need a Hadoop cluster. Locally it can do everything that it can do over Hadoop, albeit in most cases more slowly.

    For the interactive shell (Grunt):

    pig -x local
    

    To run a Pig script locally:

    pig -x local some_script.pig
    

    The best docs on how to use Pig are over at the Apache Pig site, and they've got a pretty good tutorial as well.
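
To have something concrete to experiment with, here is a minimal sketch of a first local run. The scratch directory, the file names (`scores.tsv`, `avg_scores.pig`), and the sample data are all made up for illustration; the only assumption about your setup is that the `pig` launcher is on your PATH.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory, chosen only for this example.
workdir="${TMPDIR:-/tmp}/pig_local_demo"
mkdir -p "$workdir"
cd "$workdir"

# Made-up sample data: tab-separated name and score.
printf 'alice\t90\nbob\t70\nalice\t80\nbob\t60\n' > scores.tsv

# A small Pig Latin script: load the file, group by name, average the scores.
cat > avg_scores.pig <<'EOF'
records    = LOAD 'scores.tsv' USING PigStorage('\t') AS (name:chararray, score:int);
by_name    = GROUP records BY name;
avg_scores = FOREACH by_name GENERATE group AS name, AVG(records.score) AS avg_score;
DUMP avg_scores;
EOF

# Run in local mode if Pig is installed; otherwise just report how to run it.
if command -v pig >/dev/null 2>&1; then
    pig -x local avg_scores.pig
else
    echo "pig not found on PATH; after installing it, run: pig -x local avg_scores.pig"
fi
```

In local mode the relative path in `LOAD` is resolved against the directory you run `pig` from, so no HDFS is involved. The run should end by printing one tuple per name with its average score. You can also paste the same four statements into the Grunt shell one at a time to see how each relation is built.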