
Apache Pig: How to select all columns from one relation when performing a join operation


Say I have two relations:

Relation1:

Col1......Col100, id

Relation2:

R2Col1, R2Col2, R2Col3, id

Now I am trying to do something like:

Relation3 = JOIN Relation1 BY id, Relation2 BY id USING 'replicated';

In this case, Relation3 will become:

Col1......Col100, id, R2Col1, R2Col2, R2Col3, id

I am wondering if there's a way to select the columns from Relation1 only. There are many columns, so it's not ideal to hardcode them. Ideally I am looking for something equivalent to SELECT relation1.* in SQL. Thanks a lot!


Solution

  • Yes, you can use positional notation with .. to get all the fields from Relation1. $0.. means generate all fields starting from the first field, $0. I've answered a similar question elsewhere.

    Relation4 = foreach Relation3 generate Relation1::$0..;
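
An open-ended range such as $0.. runs through to the last field of the tuple, so after a join it can also reach the Relation2 columns. If only the Relation1 fields are wanted, one option is to bound the range at Relation1's last field. The following is a minimal sketch, not tested against a particular Pig version: the file names and the shortened four-column schema are assumptions made for illustration (the question has Col1 through Col100), and project-range expressions require Pig 0.9 or later.

```pig
-- Hypothetical inputs and schemas; Relation1 is shortened to 3 columns plus id.
Relation1 = LOAD 'relation1.tsv' AS (Col1:chararray, Col2:chararray, Col3:chararray, id:int);
Relation2 = LOAD 'relation2.tsv' AS (R2Col1:chararray, R2Col2:chararray, R2Col3:chararray, id:int);

-- Same join as in the question; the last relation listed is the one held in memory.
Relation3 = JOIN Relation1 BY id, Relation2 BY id USING 'replicated';

-- Bounded range by name: from Relation1's first field through its last field.
Relation4 = FOREACH Relation3 GENERATE Relation1::Col1 .. Relation1::id;

-- The same range by position ($0 through $3 for this assumed 4-field schema).
Relation4ByPos = FOREACH Relation3 GENERATE $0 .. $3;

-- Print the resulting schema to check which fields were kept.
DESCRIBE Relation4;
```

With the real schema, the named form would stay the same apart from the last field name, while the positional form would need the index of Relation1's final column.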