I have a very large dataset that I am processing with Pig.
The data contains a timestamp (at one-second resolution), and I would like to aggregate my data at one-minute resolution (counting how many observations there are per minute and averaging other variables over that minute).
Is it possible to do that using Pig?
Thanks!
You can derive a new field from your timestamp by truncating it to the minute (for example, turning YYYYmmddHHMMss into YYYYmmddHHMM), then group by that field and aggregate your data.
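As a rough sketch of that approach, assuming the timestamp is stored as a 14-character string in YYYYmmddHHMMss form and that there is a single numeric column to average. The paths 'input_data' and 'output_per_minute', the relation names, and the field names ts and value are placeholders, not anything from your actual data:

```pig
-- Assumed input: tab-separated records of (ts, value),
-- where ts is a chararray such as 20130115093042 (YYYYmmddHHMMss).
raw = LOAD 'input_data' USING PigStorage('\t')
      AS (ts:chararray, value:double);

-- Keep the first 12 characters (YYYYmmddHHMM), dropping the seconds.
with_minute = FOREACH raw GENERATE
    SUBSTRING(ts, 0, 12) AS minute,
    value;

-- One group per distinct minute.
by_minute = GROUP with_minute BY minute;

-- Count the observations and average the variable within each minute.
stats = FOREACH by_minute GENERATE
    group                  AS minute,
    COUNT(with_minute)     AS n_obs,
    AVG(with_minute.value) AS avg_value;

STORE stats INTO 'output_per_minute';
```

If your timestamp is in a different format (epoch seconds, or a string with separators), the truncation step would need to change accordingly, but the GROUP and aggregate steps should stay the same. Additional variables can be averaged by adding more AVG(...) expressions to the last FOREACH.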