I have unstructured, pipe-delimited data with a varying number of fields per record:
key1|a1|a11|a21|a31|a41
key2|b1|b11
key3|c1|c11|c21
key4|d1
key2|b101|b111
key1|a101|a111|a121|a131|a141
Based on the first column, the records are split and distributed to directories.
z = load '/user/input/data.txt' using PigStorage('|');
split z into z1 if $0 == 'key1', z2 if $0 == 'key2', z3 if $0 == 'key3', z4 if $0 == 'key4';
z11 = foreach z1 generate $1,$2,$3,$4,$5;
z22 = foreach z2 generate $1,$2;
z33 = foreach z3 generate $1,$2,$3;
z44 = foreach z4 generate $1;
For example, given the input record key1|a1|a11|a21|a31|a41,
I need the output "a1|a11|a21|a31|a41", that is, every field except "key1".
I can get the values by specifying their positions:
z11 = foreach z1 generate $1,$2,$3,$4,$5;
Is there a way to extract this data without specifying the positions?
If you don't know exactly how many fields you have, you can use the project-range syntax:
z11 = foreach z1 generate $1..;
z22 = foreach z2 generate $1..;
z33 = foreach z3 generate $1..;
z44 = foreach z4 generate $1..;
This excludes the first field ($0)
and keeps everything from the second field ($1) onward,
without listing each field explicitly.
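
To get the pipe-delimited output shown in the question, the projected relations can then be stored with the same delimiter. This is only a sketch: the output directories below are hypothetical names chosen for illustration, so adjust them to your own layout.

```pig
-- Hypothetical output directories, one per key (assumed, not from the question).
-- PigStorage('|') writes the remaining fields joined by '|'.
store z11 into '/user/output/key1' using PigStorage('|');
store z22 into '/user/output/key2' using PigStorage('|');
store z33 into '/user/output/key3' using PigStorage('|');
store z44 into '/user/output/key4' using PigStorage('|');
```

For the input record key1|a1|a11|a21|a31|a41, the matching line under the key1 directory should then read a1|a11|a21|a31|a41.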