Let's assume I've got a database with two tables: people
which contains person's id and his/her birth year for each person and parents
which contains the (parent_id, child_id)
pairs to represent the relative relationships between people. To make the explanation easier let's assume each person has either 0 children or 1 child. Here is an example of the data in the database (as a set of SQL statements to create it on MySQL):
CREATE TABLE people (
id INTEGER NOT NULL AUTO_INCREMENT,
birth_year INTEGER NOT NULL,
CONSTRAINT PRIMARY KEY(id)
);
CREATE TABLE parents (
parent_id INTEGER NOT NULL,
child_id INTEGER NOT NULL,
CONSTRAINT PRIMARY KEY(parent_id,child_id)
);
-- Not creating FOREIGN KEYS, because it's just an example
INSERT INTO people (id, birth_year) VALUES (1, 1937);
INSERT INTO people (id, birth_year) VALUES (2, 1943);
INSERT INTO people (id, birth_year) VALUES (3, 1974);
INSERT INTO people (id, birth_year) VALUES (4, 2001);
INSERT INTO people (id, birth_year) VALUES (5, 2020);
INSERT INTO parents (parent_id, child_id) VALUES (1, 4);
INSERT INTO parents (parent_id, child_id) VALUES (3, 5);
Result:
Now I want to make up a query which will retrieve the id
of the person, whose child was born at the earliest age of the parent (for example, if I was born in 1234 and my child was born in 1300, then my age when my child was born was 1300 - 1234 = 66
and I would like to find a person which got their child earlier than others).
I have made up some queries for it, but each of them either didn't work or had duplications or both. The one I like most is
SELECT id AS pid, -- Parent id
(SELECT child_id FROM parents WHERE parent_id=pid) AS cid -- Child id
FROM people WHERE
EXISTS(SELECT cid) -- Only selecting parents who have children (not sure this part is correct)
ORDER BY (birth_year - (SELECT birth_year FROM people WHERE id=cid)) ASC -- Order by age when they got their child
LIMIT 1;
But this one fails in MySQL with the error:
ERROR 1054 (42S22) at line 24: Unknown column 'cid' in 'field list'
How do I fix the error? Another thing I am worried about is that as a result, I will select not only the parent's id but also the id of one of his/her children. Is it possible to avoid it? Probably there is a better way to select the data I'm looking for?
You can get the ages by using joins:
select c.*, (p.birth_year - c.birth_year) as parent_age
from parents pa join
people p
on pa.parent_id = p.id join
people c
on pa.child_id = pc.id;
To get all the rows with the minimum, use window functions:
select x.*
from (select c.*, (p.birth_year - c.birth_year) as parent_age,
min(p.birth_year - c.birth_year) over () as min_age
from parents pa join
people p
on pa.parent_id = p.id join
people c
on pa.child_id = pc.id
) x
where parent_age = min_age;