Search code examples
sqlgoogle-bigquerygdelt

Extract URL of theme or text from the GDELT query


I am trying to extract the news related to cryptocurrency from GDELT. I am using the following query:

select date(_partitiontime) date, count(theme) occurences
from `gdelt-bq.gdeltv2.gkg_partitioned`, unnest(split(themes,';')) as theme 
where lower(theme) like "%bitcoin%"
group by date
-- order by date 

I want to extract not only how many times the word bitcoin occurred but also the news article or text as well.

I have also tried:

select Date,SourceCommonName,Themes,Persons,DocumentIdentifier 
from gdelt-bq.gdeltv2.gkg_partitioned, unnest(split(themes,';')) as theme
where theme like "%bitcoin%"
limit 100

but this query returns 0 results. It seems like I am missing something to get the URLs of news related to bitcoin.

Any help will be much appreciated. Thanks!


Solution

  • You should use lower(theme) in where clause of second query