My data from different Sensor units is comming with unix_time stam in Hadoop. It is quite easy to get the first and last value of each day in SQL using the FIRST and LAST statement, as given below. Here I am getting the temperature at the start and at the end of the day.
SELECT unit, FIRST(Temp) as Start_Day_Value, LAST(Temp) as End_Day_Value
FROM Sensor_Data.Table
WHERE cast(ts/1000 as TIMESTAMP) BETWEEN "2021-01-05 00:00:00" AND "2021-01-05 23:59:59"
GROUP BY unit
ORDER BY unit;
Is there any equivalent command in the IMPALA so that I can get the first and last value of the each column as am getting in SQL.
I solved the problem in the following way:
SELECT DISTINCT u.unit, u.initial_temp, u.final_temp, u.initial_p, u.final_p
FROM(SELECT unit,
first_value(temperature) OVER(PARTITION BY unit ORDER BY ts DESC) as final_temp,
first_value(temperature) OVER(PARTITION BY unit ORDER BY ts ASC) as initial_temp,
first_value(pressure) OVER(PARTITION BY unit ORDER BY ts DESC) as final_p,
first_value(pressure) OVER(PARTITION BY unit ORDER BY ts ASC) as initial_p
FROM Sensor_Data.Table
WHERE cast(ts/1000 as TIMESTAMP) BETWEEN "2020-12-11 00:00:00" AND "2020-12-11 23:59:59"
) AS u
ORDER BY unit;