I have a table where I've stored some information from a Json object:
Table:
investment
unit(string)
data(string)
If a run the the query SELECT * FROM "db"."investment" limit 10;
I got the following result:
Unit Data
CH [{"from":"CH","when":"2021-02-16","who":"pp@gmail.com"}]
AB [{"from":"AB","when":"2020-02-16","who":"jj@gmail.com"}]
Now, I run the following basic query to return value within the Json nested object:
SELECT json_extract_scalar(Data, '$[0].who') email FROM "db"."investment";
and I got the following result:
email
jj@gmail.com
pp@gmail.com
How can filter this query with WHERE
clause to return just a single value:
I've tried this, but obviously it doesn't work as normal SQL table with row and columns:
SELECT json_extract_scalar(Data, '$[0].who') email FROM "db"."investment" WHERE email = "pp@gmail.com";
Any help with this?
your question seems to have a few typos.
Date
in Unit Date
should probably be Data
key
referring to. Perhaps you mean Data
also, note that athena is case insensitive, and column names are converted to lower case (even if you quote them).
with that out of the way, you have to use the full expression that extracts your email from the json document in the where
clause. the column alias defined is not accessible to the rest of the query.
here's a self contained example:
with test (unit, data) as (
values
('CH', JSON '[{"from":"CH","when":"2021-02-16","who":"pp@gmail.com"}]'),
('AB', JSON '[{"from":"AB","when":"2020-02-16","who":"jj@gmail.com"}]')
)
select json_extract_scalar(data, '$[0].who') email
from test
where json_extract_scalar(data, '$[0].who') = 'pp@gmail.com';
outputs:
| email |
+--------------+
| pp@gmail.com |