
Command required to customize a field in a record using Apache Pig


Sample.txt file

2017-01-01 15:35:18 I had heavy snacks
2017-02-01 12:45:19 I am feeling hungry
2017-03-01 10:25:19 I completed my work that is assigned
  • Requirement: I need the date as the first field, the time as the second field, and all of the remaining text as the third field. (Note: this is a space-delimited file.) Please help me with this.

Solution

  • Load the file into a single field and then use STRSPLIT.

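    -- Load each line as a single chararray field.
    A = LOAD '/path/sample.txt' USING TextLoader() AS (line:chararray);
    -- The limit 3 splits on the first two spaces only, giving at most 3 parts;
    -- the remaining text, spaces included, stays together as the third part.
    B = FOREACH A GENERATE STRSPLIT(line, ' ', 3);
    DUMP B;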
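
STRSPLIT returns its parts as a single tuple, so B has one tuple-valued field per record rather than three separate fields. A minimal sketch of one way to get three top-level fields is to wrap the call in FLATTEN and name the results. The relation name C and the field names log_date, log_time, and message are illustrative assumptions, not part of the original answer.

```pig
-- Load each line as one chararray, as in the solution above.
A = LOAD '/path/sample.txt' USING TextLoader() AS (line:chararray);

-- FLATTEN un-nests the tuple returned by STRSPLIT into top-level fields.
-- The names log_date, log_time, and message are hypothetical.
C = FOREACH A GENERATE FLATTEN(STRSPLIT(line, ' ', 3))
        AS (log_date:chararray, log_time:chararray, message:chararray);

DUMP C;
```

With named fields, later statements can refer to them directly, for example `FILTER C BY log_date == '2017-02-01';`.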