Search code examples
apache-pig

How to extract two groups of tuples into a single variable?


I need to extract the LOG_COUNT1 and LOG_COUNT2 into a single group of two records of structure: police_force, date, crime_count

joined_tbl: {group: chararray,LOG_COUNT1: {(police_force: chararray,date: chararray,crime_count: long)},LOG_COUNT2: {(police_force: chararray,date: chararray,crime_count: long)}}

Solution

  • Try,

    F1 = FOREACH joined_tbl GENERATE group, FLATTEN(LOG_COUNT1) ;
    F2 = FOREACH joined_tbl GENERATE group, FLATTEN(LOG_COUNT2) ;
    U = UNION F1, F2 ;
    # merged_tbl = FOREACH (GROUP U BY group) GENERATE group AS group, U.(police_force, date, crime_count) AS LOG_COUNT;
    DUMP U;