I am using sql alchemy to query BigQuery.
I have a subquery and I am trying to dedup it using ArrayAgg
basically it should look like this:
SELECT
t.id,
ARRAY_AGG (t ORDER BY `time_col` DESC LIMIT 1)[OFFSET(0)].*
from (
SELECT ...
) t
GROUP BY t.id
I created a compiler function that generates the ARRAY_AGG as I desire.
The problem is, when I try to execute the query, the array agg field gets an automatic label which ruins everything.
Is there a way to just make it use the *
?
So I found the answer with the help of my colleague, and wanted to put it here should anyone will ever face this too
We basically generated the following query without the *
so we got a struct, and wrapped it with another select to extract the fields.
We made a function that wraps the query with this logic and returns something like this:
SELECT deduped_record.field_1, ..., deduped_record.field_n
FROM (
SELECT dedup_by_field,
ARRAY_AGG(source_table ORDER BY dedup_time_field DESC LIMIT 1)[OFFSET(0)] AS deduped_record
FROM (
SELECT field_1,...,field_n
FROM table
)
GROUP BY dedup_by_field
) AS alias
We just get the fields via the alias, instead of the *
.