There is time series trading transactions history stored with google's BigQuery.
# Transaction history scheme
exchange_id INTEGER REQUIRED
from_id INTEGER REQUIRED
to_id INTEGER REQUIRED
price FLOAT REQUIRED
size FLOAT REQUIRED
ts TIMESTAMP REQUIRED
is_sell BOOLEAN NULLABLE
_PARTITIONTIME TIMESTAMP NULLABLE
exchange_id - platform where transation occured
from_id - base symbol
to_id - quote symbol
price - trade price
size - trade quantity
I need to aggregate OHLC data over 30 seconds time interval grouped by
exchange_id, from_id, to_id
. How can I do this in the BigQuery?
# Required OHLC aggregated data scheme
ts TIMESTAMP REQUIRED
exchange_id INTEGER REQUIRED
from_id INTEGER REQUIRED
to_id INTEGER REQUIRED
open FLOAT REQUIRED
high FLOAT REQUIRED
low FLOAT REQUIRED
close FLOAT REQUIRED
volume FLOAT REQUIRED
_PARTITIONTIME TIMESTAMP NULLABLE
open - first price in interval
high - highest price..
low - lowest price..
close - last price..
volume - SUM of all trade size's in current interval
Most promising ideas were:
SELECT
TIMESTAMP_SECONDS(
UNIX_SECONDS(ts) -
60 * 1000000
) AS time,
exchange_id,
from_id,
to_id,
MIN(price) as low,
MAX(price) as high,
SUM(size) as volume
FROM
`table`
GROUP BY
time, exchange_id, from_id, to_id
ORDER BY
time
And this one:
SELECT
exchange_id,from_id,to_id,
MAX(price) OVER (PARTITION BY exchange_id,from_id,to_id ORDER BY ts RANGE BETWEEN 60 * 1000000 PRECEDING AND CURRENT ROW) as high,
MIN(price) OVER (PARTITION BY exchange_id,from_id,to_id ORDER BY ts RANGE BETWEEN 60 * 1000000 PRECEDING AND CURRENT ROW) as low,
SUM(size) OVER (PARTITION BY exchange_id,from_id,to_id ORDER BY ts RANGE BETWEEN 60 * 1000000 PRECEDING AND CURRENT ROW) as volume,
FROM [table];
# returns:
1 1 4445 3808 9.0E-8 9.0E-8 300000.0
2 1 4445 3808 9.0E-8 9.0E-8 300000.0
3 1 4445 3808 9.0E-8 9.0E-8 300000.0
...
14 1 4445 3808 9.0E-8 9.0E-8 865939.3721800799
15 1 4445 3808 9.0E-8 9.0E-8 865939.3721800799
16 1 4445 3808 9.0E-8 9.0E-8 865939.3721800799
But nothing of this works. It seems that I missing something important about sliding window in BigQuery.
Below is for BigQuery Standard SQL
#standardsql
SELECT
exchange_id,
from_id,
to_id,
TIMESTAMP_SECONDS(DIV(UNIX_SECONDS(ts), 30) * 30) time,
ARRAY_AGG(price ORDER BY ts LIMIT 1)[SAFE_OFFSET(0)] open,
MAX(price) high,
MIN(price) low,
ARRAY_AGG(price ORDER BY ts DESC LIMIT 1)[SAFE_OFFSET(0)] close,
SUM(size) volume
FROM `yourproject.yourdataset.yourtable`
GROUP BY 1, 2, 3, 4