I am trying to extract the parts produced per minute, where v is a cumulative counter of the parts produced up to that time.
My Azure Stream Analytics query is as follows:
select
    x.fqn,
    max(cast(y.arrayvalue.v as BIGINT)) - min(cast(y.arrayvalue.v as BIGINT)) as ppm
from
    (select
        TS.ArrayIndex,
        TS.ArrayValue.FQN,
        TS.ArrayValue.vqts
    from
        [EventHubInput] as hub timestamp by y.arrayvalue.t
    cross apply GetArrayElements(hub.timeseries) as TS) as x
cross apply GetArrayElements(x.vqts) as y
where x.fqn like '%Production%' and y.arrayvalue.q = 192
group by tumblingwindow(minute, 1), x.fqn
My input data looks like this:
{
  "timeSeries": [
    {
      "fqn": "MyEnterprise.Gateways.GatewayE.CLX.Tags.StateBasic",
      "vqts": [
        { "v": "", "q": 192, "t": "2016-06-24T16:39:45.683+0000" }
      ]
    },
    {
      "fqn": "MyEnterprise.Gateways.GatewayE.CLX.Tags.ProductionCount",
      "vqts": [
        { "v": 264, "q": 192, "t": "2016-06-24T16:39:45.683+0000" }
      ]
    },
    {
      "fqn": ".Gateways.GatewayE.CLX.Tags.StateDetailed",
      "vqts": [
        { "v": "", "q": 192, "t": "2016-06-24T16:39:45.683+0000" }
      ]
    }
  ]
}
My query returns no result. When I remove the timestamp by y.arrayvalue.t clause and add y.arrayvalue.t to the group by clause instead, I do get some results. I suspect this is because each event contains more than one timestamp field, so I would like to know whether it is possible to assign the time value of the first array element to timestamp by, something like timestamp by y[0].t.
As of today, Azure Stream Analytics does not support timestamp by over a value inside an array. So the answer to your question "is it possible to assign the time data of the first array element to timestamp by" is no.
Here is a workaround you can use:
First, flatten the input message in one job and output to a staging Event Hub:
WITH flattenTS AS
(
SELECT
TS.ArrayIndex,
TS.ArrayValue.FQN,
TS.ArrayValue.vqts
FROM [EventHubInput] AS hub
CROSS APPLY GetArrayElements(hub.timeseries) AS TS
)
, flattenVQTS AS
(
SELECT
ArrayIndex
,FQN
,vqts.ArrayValue.v as v
,vqts.ArrayValue.q as q
,vqts.ArrayValue.t as t
FROM flattenTS TS
CROSS APPLY GetArrayElements(TS.vqts) AS vqts
)
SELECT *
INTO [staging_eventhub]
FROM flattenVQTS
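To see what this first job produces, here is a minimal Python sketch of the two CROSS APPLY GetArrayElements steps, run against a trimmed version of the sample input above. This is a local illustration of the flattening, not ASA code; the field names mirror the query:

```python
import json

# A trimmed version of the sample event (only one timeSeries entry shown).
event = json.loads("""
{
  "timeSeries": [
    {
      "fqn": "MyEnterprise.Gateways.GatewayE.CLX.Tags.ProductionCount",
      "vqts": [
        { "v": 264, "q": 192, "t": "2016-06-24T16:39:45.683+0000" }
      ]
    }
  ]
}
""")

# First CROSS APPLY: one row per timeSeries element (flattenTS).
# Second CROSS APPLY: one row per vqts element (flattenVQTS).
flattened = [
    {"ArrayIndex": i, "FQN": ts["fqn"], "v": vqt["v"], "q": vqt["q"], "t": vqt["t"]}
    for i, ts in enumerate(event["timeSeries"])
    for vqt in ts["vqts"]
]

for row in flattened:
    print(row)
```

Each flattened row now carries its own t at the top level, which is what makes timestamp by usable in the second job.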
Then, use another job to read the flattened messages and do the windowed aggregation:
SELECT
FQN
,MAX(CAST(v as BIGINT))-MIN(CAST(v as BIGINT)) as ppm
FROM [staging_eventhub] timestamp by t
WHERE fqn LIKE '%Production%' and q=192
GROUP BY tumblingwindow(minute,1), fqn
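For intuition, the windowed aggregation can be simulated in Python over a couple of hypothetical flattened records (the counter values 264 and 270 are invented for illustration; the one-minute tumbling window is emulated by truncating the event timestamp to the minute):

```python
from datetime import datetime

# Hypothetical flattened records, as the staging Event Hub would carry them.
records = [
    {"FQN": "MyEnterprise.Gateways.GatewayE.CLX.Tags.ProductionCount",
     "v": "264", "q": 192, "t": "2016-06-24T16:39:45.683+0000"},
    {"FQN": "MyEnterprise.Gateways.GatewayE.CLX.Tags.ProductionCount",
     "v": "270", "q": 192, "t": "2016-06-24T16:39:58.120+0000"},
]

def window_start(t):
    # tumblingwindow(minute, 1): truncate the event time to the minute.
    dt = datetime.strptime(t, "%Y-%m-%dT%H:%M:%S.%f%z")
    return dt.replace(second=0, microsecond=0)

# WHERE fqn LIKE '%Production%' AND q = 192; GROUP BY window start, FQN.
groups = {}
for r in records:
    if "Production" in r["FQN"] and r["q"] == 192:
        key = (window_start(r["t"]), r["FQN"])
        groups.setdefault(key, []).append(int(r["v"]))

# MAX(v) - MIN(v) per window: parts produced during that minute.
ppm = {key: max(vs) - min(vs) for key, vs in groups.items()}
for (start, fqn), parts in ppm.items():
    print(start.isoformat(), fqn, parts)
```

Both records fall in the same minute, so one window is emitted with ppm = 270 - 264 = 6.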
You may wonder whether we could simply combine the two jobs above as multiple steps in a single job and avoid the staging Event Hub. Unfortunately, you cannot use "timestamp by" when selecting from a CTE or subquery today.