I am trying to query two bigquery tables from two different datasets to get 2 separate columns. I have both tried union and joins but they are not giving me want I need. Below is the query I tried
with abagrowth as (
SELECT
session abas,
term abat,
COUNT(distinct studentid) AS acount,
ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS agrowth
FROM
aba.abaresult
GROUP BY
1,
2
ORDER BY
1,
2),
bidagrowth as (
SELECT
session bidas,
term bidat,
COUNT(distinct studentid) AS bcount,
ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS bgrowth
FROM
bida.bidaresult
GROUP BY
1,
2
ORDER BY
1,
2)
select abas, agrowth from abagrowth
union all
select bidas, bgrowth from bidagrowth
The dataset is similar to this
name subject session totalscore
-------------------------------------------
jack maths 2013/2014 70
jane maths 2013/2014 65
jill maths 2013/2014 80
jack maths 2014/2015 72
jack eng 2014/2015 87
jane science 2014/2015 67
jill maths 2014/2015 70
jerry eng 2014/2015 70
jaasp science 2014/2015 85
The table I am trying to get is meant to be in this format or something similar
session agrowth bgrowth
2013/2014 null null
2014/2015 10% 11%
2015/2016 5% 2%
The figures above are assumed for example sake.
Questions
Is this possible with bigquery?
If yes, How can one achieve this?
Thanks
Regarding the dataset. Yes, you can query two datasets. Check out this answer. Basically, you just need to indicate the project (optional), dataset and table you're using.
For the data you want to get. You can achieve it using a JOIN rather than UNION. JOINING the tables by session will allow you to have one row per session. Then you can choose what columns to include in your SELECT.
WITH abagrowth AS (
SELECT
session,
term abat,
COUNT(distinct studentid) AS acount,
ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS agrowth
FROM
aba.abaresult
GROUP BY
1,
2
ORDER BY
1,
2),
bidagrowth AS (
SELECT
session,
term bidat,
COUNT(distinct studentid) AS bcount,
ROUND(100 * (COUNT(distinct studentid) - LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session)) / LAG(COUNT(distinct studentid), 1) OVER (ORDER BY session),0) || '%' AS bgrowth
FROM
bida.bidaresult
GROUP BY
1,
2
ORDER BY
1,
2)
SELECT aba.session, aba.agrowth, bida.bgrowth
FROM abagrowth aba
JOIN bidagrowth bida
ON aba.session = bida.session
UNION will stack the results from the two queries.