I have the single raw file below and need to split it into different relations.
If a line starts with 0, the complete line should go to relation 'header'.
If a line starts with 1, the complete line should go to relation 'ban'.
If a line starts with 2, the complete line should go to relation 'sub'.
If a line starts with 3, the complete line should go to relation 'item'.
If a line starts with 4, the complete line should go to relation 'tax'.
0ALH 012012050104.00.00356.0012.06001
1980377362 HAW R 120010000IRN+000016323SABRINA D. ORTIZ PO BOX 1764 KAILUA KONA HI967451764September 2009 03.4June 2008 06.0E 00
2980377362 8089363822 HAW 120010000SABRINA D. ORTIZ 75-1027 HENRY ST KAILUA KONA HI967403154September 2009 03.4June 2008 06.0EN00
2980377362 8089375559 HAW 120010000SABRINA D. ORTIZ 75-1027 HENRY ST KAILUA KONA HI967403154September 2009 03.4June 2008 06.0EN00
3980377362 8089363822 911FEEO O SNOTAX1001+000000066201205029-1-1 Service Fee 0000004950533060000002163C
3980377362 8089363822 GSMUSELASCPKG R R S 00000000020120502Custom Call Package 000000495053163
4980377362 8089363822 MSGFTM2AMM2ABUNR L+000003000U 105 +04160000+000000125 0000000000000000495053186
4980377362 8089363822 MSGFTM2AMM2ABUNR L+000003000U 131 +00084600+000000003 0000000000000000495053186
4980377362 8089363822 MSGFTM2AMM2ABUNR L+000003000U 133 +04146600+000000124 0000000000000000495053186
Can you please help me with a Pig script to do this?
Load each line of the file into a single field. For each line, take the first character, compare it with the values you are looking for, and use SPLIT to route the line into the matching relation.
A = LOAD '/path/file.txt' USING TextLoader() as (line:chararray);
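-- Refer to the field as 'line' rather than 'A.line'; inside SPLIT, 'A.line' is read as a scalar projection of the whole relation.
SPLIT A INTO header IF SUBSTRING(line, 0, 1) == '0',
             ban IF SUBSTRING(line, 0, 1) == '1',
             sub IF SUBSTRING(line, 0, 1) == '2',
             item IF SUBSTRING(line, 0, 1) == '3',
             tax IF SUBSTRING(line, 0, 1) == '4';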
DUMP header;
DUMP ban;
DUMP sub;
DUMP item;
DUMP tax;
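DUMP only prints each relation to the console. If the goal is to keep the five record types as separate outputs, a variation along the following lines may be closer to what is needed. This is a minimal sketch, not tested against the real file: the output directories under '/path/out/' and the relation name 'other' are made up for illustration, and the OTHERWISE branch assumes Pig 0.10 or later.

```pig
A = LOAD '/path/file.txt' USING TextLoader() AS (line:chararray);

-- Route each line by its first character; 'other' (hypothetical name)
-- collects lines that do not start with 0-4. OTHERWISE needs Pig 0.10+.
SPLIT A INTO header IF SUBSTRING(line, 0, 1) == '0',
             ban IF SUBSTRING(line, 0, 1) == '1',
             sub IF SUBSTRING(line, 0, 1) == '2',
             item IF SUBSTRING(line, 0, 1) == '3',
             tax IF SUBSTRING(line, 0, 1) == '4',
             other OTHERWISE;

-- Write each relation to its own directory (paths are placeholders).
STORE header INTO '/path/out/header';
STORE ban INTO '/path/out/ban';
STORE sub INTO '/path/out/sub';
STORE item INTO '/path/out/item';
STORE tax INTO '/path/out/tax';
STORE other INTO '/path/out/other';
```

Checking the 'other' output after a run is a cheap way to spot lines with an unexpected leading character.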