I need to get the filename with each row, so I used:
data = LOAD 'data.csv' using PigStorage(',','-tagFile') AS (filename:chararray);
But in data.csv some columns contain a comma (,) in their content as well, so to handle that I used:
data = LOAD 'data.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage() AS (filename:chararray);
But I couldn't find any way to use the -tagFile option with CSVExcelStorage. How can I use CSVExcelStorage and the -tagFile option together?
Thanks
I found a way to do both (get the filename with each row, and remove the delimiter when it appears inside column content):
data = LOAD 'data.csv' using PigStorage(',','-tagFile') AS (filename:chararray, record:chararray);
/* Remove any comma (,) that appears inside quoted column content */
replaceComma = FOREACH data GENERATE filename, REPLACE (record, ',(?!(([^\\"]*\\"){2})*[^\\"]*$)', '');
/* Remove the double quotes (") that CSV puts around a column when it contains a comma */
replaceQuotes = FOREACH replaceComma GENERATE filename, REPLACE ($1,'"','') as record;
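To see what the two REPLACE steps do to a single line, here is a small Python sketch of the same regex logic. It assumes that Pig's REPLACE follows Java regular-expression semantics and that Python's re module treats this particular pattern the same way. The function name clean_record and the sample row are made up for illustration and are not part of the Pig script.

```python
import re

# Same pattern as in the Pig script, without the extra Pig string escaping.
# It matches a comma that is NOT followed by an even number of double quotes
# up to the end of the line, i.e. a comma sitting inside a quoted column.
COMMA_INSIDE_QUOTES = re.compile(r',(?!(([^"]*"){2})*[^"]*$)')


def clean_record(record):
    """Hypothetical helper mirroring the replaceComma and replaceQuotes steps."""
    without_inner_commas = COMMA_INSIDE_QUOTES.sub('', record)
    return without_inner_commas.replace('"', '')


if __name__ == "__main__":
    sample = '1,"Smith, John",NY'   # made-up sample row
    print(clean_record(sample))     # 1,Smith John,NY
```

Note that this drops the embedded comma rather than keeping it, so "Smith, John" ends up as Smith John, and it only behaves as intended when the double quotes are balanced within each line.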
Once the data is loaded properly, without the embedded commas, I am free to perform any operation on it. A detailed use case is available on my blog.