I'm trying to join the comments to the posts for multiple tables. I need an AS clause because the posts table and comments table share a column 'score'.
My goal is to be able to find the top comments within the top posts with the data in all these tables.
#standardSQL
SELECT posts.title, posts.url, posts.score AS postsscore,
DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
comments.body, comments.score AS commentsscore, comments.id
FROM
fh-bigquery.reddit_posts.2015_12
,
fh-bigquery.reddit_posts.2016_01
,
fh-bigquery.reddit_posts.2016_02
,
fh-bigquery.reddit_posts.2016_03
,
fh-bigquery.reddit_posts.2016_04
,
fh-bigquery.reddit_posts.2016_05
,
fh-bigquery.reddit_posts.2016_06
,
fh-bigquery.reddit_posts.2016_07
,
fh-bigquery.reddit_posts.2016_08
,
fh-bigquery.reddit_posts.2016_09
,
fh-bigquery.reddit_posts.2016_10
,
fh-bigquery.reddit_posts.2016_11
,
fh-bigquery.reddit_posts.2016_12
,
fh-bigquery.reddit_posts.2017_01
,
fh-bigquery.reddit_posts.2017_02
,
fh-bigquery.reddit_posts.2017_03
,
fh-bigquery.reddit_posts.2017_04
,
fh-bigquery.reddit_posts.2017_05
,
fh-bigquery.reddit_posts.2017_06
,
fh-bigquery.reddit_posts.2017_07
,
fh-bigquery.reddit_posts.2017_08
,
fh-bigquery.reddit_posts.2017_09
,
fh-bigquery.reddit_posts.2017_10
,
fh-bigquery.reddit_posts.2017_11
,
fh-bigquery.reddit_posts.2017_12
,
fh-bigquery.reddit_posts.2018_01
,
fh-bigquery.reddit_posts.2018_02
,
fh-bigquery.reddit_posts.2018_03
,
fh-bigquery.reddit_posts.2018_04
,
fh-bigquery.reddit_posts.2018_05
,
fh-bigquery.reddit_posts.2018_06
,
fh-bigquery.reddit_posts.2018_07
,
fh-bigquery.reddit_posts.2018_08
,
fh-bigquery.reddit_posts.2018_09
,
fh-bigquery.reddit_posts.2018_10
AS posts
JOIN
fh-bigquery.reddit_comments.2015_12
,
fh-bigquery.reddit_comments.2016_01
,
fh-bigquery.reddit_comments.2016_02
,
fh-bigquery.reddit_comments.2016_03
,
fh-bigquery.reddit_comments.2016_04
,
fh-bigquery.reddit_comments.2016_05
,
fh-bigquery.reddit_comments.2016_06
,
fh-bigquery.reddit_comments.2016_07
,
fh-bigquery.reddit_comments.2016_08
,
fh-bigquery.reddit_comments.2016_09
,
fh-bigquery.reddit_comments.2016_10
,
fh-bigquery.reddit_comments.2016_11
,
fh-bigquery.reddit_comments.2016_12
,
fh-bigquery.reddit_comments.2017_01
,
fh-bigquery.reddit_comments.2017_02
,
fh-bigquery.reddit_comments.2017_03
,
fh-bigquery.reddit_comments.2017_04
,
fh-bigquery.reddit_comments.2017_05
,
fh-bigquery.reddit_comments.2017_06
,
fh-bigquery.reddit_comments.2017_07
,
fh-bigquery.reddit_comments.2017_08
,
fh-bigquery.reddit_comments.2017_09
,
fh-bigquery.reddit_comments.2017_10
,
fh-bigquery.reddit_comments.2017_11
,
fh-bigquery.reddit_comments.2017_12
,
fh-bigquery.reddit_comments.2018_01
,
fh-bigquery.reddit_comments.2018_02
,
fh-bigquery.reddit_comments.2018_03
,
fh-bigquery.reddit_comments.2018_04
,
fh-bigquery.reddit_comments.2018_05
,
fh-bigquery.reddit_comments.2018_06
,
fh-bigquery.reddit_comments.2018_07
,
fh-bigquery.reddit_comments.2018_08
,
fh-bigquery.reddit_comments.2018_09
,
fh-bigquery.reddit_comments.2018_10
AS comments
ON posts.id = SUBSTR(comments.link_id, 4)
WHERE posts.subreddit = 'Showerthoughts' AND posts.score >100 AND comments.score >100
ORDER BY posts.score DESC
My goal is to be able to find the top comments within the top posts with the data in all these tables.
Ok, so the issues with this query:
JOIN
, not UNION
. So you would need to UNION
the tables.*
at the end of a table name to expand to all matching tables.With that said, a working query would be:
#standardSQL
SELECT posts.title, posts.url, posts.score AS postsscore,
DATE_TRUNC(DATE(TIMESTAMP_SECONDS(posts.created_utc)), MONTH),
SUBSTR(comments.body, 0, 80), comments.score AS commentsscore, comments.id
FROM `fh-bigquery.reddit_posts.2015*` AS posts
JOIN `fh-bigquery.reddit_comments.2015*` AS comments
ON posts.id = SUBSTR(comments.link_id, 4)
WHERE posts.subreddit = 'Showerthoughts'
AND posts.score >100
AND comments.score >100
ORDER BY posts.score DESC