I'm developing a MySQL database to handle transport schedules. Each train line or bus journey has a unique identifier and the journey from start
to finish
might not call at all potential stopping off points.
I have three tables: schedule, station and journey. The latter two only contain an auto-increment id and the varchar identifier at present. The schedule table contains an auto-increment id, s_id and j_id (referring to the station and journey id respectively), the stop_order and the arrival/departure times.
Running this SQL:
SELECT schedules.id, journeys.journey, stations.station, schedules.s_order,
schedules.a_time, schedules.d_time
FROM `schedules`
JOIN `journeys` ON schedules.j_id = journeys.j_id
JOIN `stations` ON schedules.s_id = stations.s_id
returns this result:
+----+---------+---------+---------+----------+----------+
| id | journey | station | s_order | a_time | d_time |
+----+---------+---------+---------+----------+----------+
| 1 | J1 | STA | 1 | NULL | 07:00:00 |
| 2 | J1 | STB | 2 | 07:09:00 | 07:10:00 |
| 3 | J1 | STC | 3 | 07:29:00 | 07:30:00 |
| 4 | J1 | STD | 4 | 07:44:00 | 07:45:00 |
| 5 | J1 | STE | 5 | 07:59:00 | 08:00:00 |
| 6 | J1 | STF | 6 | 08:15:00 | NULL |
| 7 | J2 | STA | 1 | NULL | 07:30:00 |
| 8 | J2 | STC | 2 | 07:59:00 | 08:00:00 |
| 9 | J2 | STF | 3 | 08:45:00 | NULL |
| 10 | J3 | STA | 1 | NULL | 08:00:00 |
| 11 | J3 | STB | 2 | 08:09:00 | 08:10:00 |
| 12 | J3 | STD | 3 | 08:44:00 | 08:45:00 |
| 13 | J3 | STE | 4 | 09:00:00 | NULL |
| 14 | J4 | STA | 1 | NULL | 08:30:00 |
| 15 | J4 | STD | 2 | 09:14:00 | 09:15:00 |
| 16 | J4 | STF | 3 | 09:45:00 | NULL |
| 17 | J5 | STF | 1 | NULL | 07:10:00 |
| 18 | J5 | STE | 2 | 07:24:00 | 07:25:00 |
| 19 | J5 | STD | 3 | 07:39:00 | 07:40:00 |
| 20 | J5 | STC | 4 | 07:54:00 | 07:55:00 |
| 21 | J5 | STB | 5 | 08:14:00 | 08:15:00 |
| 22 | J5 | STA | 6 | 08:25:00 | NULL |
| 23 | J6 | STF | 1 | NULL | 07:30:00 |
| 24 | J6 | STC | 2 | 08:14:00 | 08:15:00 |
| 25 | J6 | STA | 3 | 08:45:00 | NULL |
| 26 | J7 | STE | 1 | NULL | 08:00:00 |
| 27 | J7 | STD | 2 | 08:14:00 | 08:15:00 |
| 28 | J7 | STB | 3 | 08:49:00 | 08:50:00 |
| 29 | J7 | STA | 4 | 09:00:00 | NULL |
| 30 | J8 | STF | 1 | NULL | 08:20:00 |
| 31 | J8 | STD | 2 | 08:49:00 | 08:50:00 |
| 32 | J8 | STA | 3 | 09:35:00 | NULL |
+----+---------+---------+---------+----------+----------+
There are four journeys (J1 to J8) in each direction.
I need to be able to answer questions like:
Which journeys go from STB to STD (answer J1 and J3) but exclude journeys that go in the opposite direction (J5 and J7)?
Which journeys stop at STC (answer J1, J2, J5 and J6)?
I've tried subqueries, a self join, creating a temporary table then querying that but I'm hitting my head against a brick wall. I don't even know if the design of my tables is optimal.
Finally, here's the SQL to create and populate the tables:
CREATE TABLE `journeys` (
`id` int(11) NOT NULL,
`j_id` int(11) NOT NULL,
`journey` varchar(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `journeys` (`id`, `j_id`, `journey`) VALUES
(1, 1, 'J1'),
(2, 2, 'J2'),
(3, 3, 'J3'),
(4, 4, 'J4'),
(5, 5, 'J5'),
(6, 6, 'J6'),
(7, 7, 'J7'),
(8, 8, 'J8');
.
CREATE TABLE `stations` (
`id` int(11) NOT NULL,
`s_id` int(11) NOT NULL,
`station` varchar(3) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `stations` (`id`, `s_id`, `station`) VALUES
(1, 1, 'STA'),
(2, 2, 'STB'),
(3, 3, 'STC'),
(4, 4, 'STD'),
(5, 5, 'STE'),
(6, 6, 'STF');
.
CREATE TABLE `schedules` (
`id` int(11) NOT NULL,
`j_id` int(11) NOT NULL,
`s_id` int(11) NOT NULL,
`s_order` int(11) NOT NULL,
`a_time` time DEFAULT NULL,
`d_time` time DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `schedules` (`id`, `j_id`, `s_id`, `s_order`, `a_time`, `d_time`)
VALUES
(1, 1, 1, 1, NULL, '07:00:00'),
(2, 1, 2, 2, '07:09:00', '07:10:00'),
(3, 1, 3, 3, '07:29:00', '07:30:00'),
(4, 1, 4, 4, '07:44:00', '07:45:00'),
(5, 1, 5, 5, '07:59:00', '08:00:00'),
(6, 1, 6, 6, '08:15:00', NULL),
(7, 2, 1, 1, NULL, '07:30:00'),
(8, 2, 3, 2, '07:59:00', '08:00:00'),
(9, 2, 6, 3, '08:45:00', NULL),
(10, 3, 1, 1, NULL, '08:00:00'),
(11, 3, 2, 2, '08:09:00', '08:10:00'),
(12, 3, 4, 3, '08:44:00', '08:45:00'),
(13, 3, 5, 4, '09:00:00', NULL),
(14, 4, 1, 1, NULL, '08:30:00'),
(15, 4, 4, 2, '09:14:00', '09:15:00'),
(16, 4, 6, 3, '09:45:00', NULL),
(17, 5, 6, 1, NULL, '07:10:00'),
(18, 5, 5, 2, '07:24:00', '07:25:00'),
(19, 5, 4, 3, '07:39:00', '07:40:00'),
(20, 5, 3, 4, '07:54:00', '07:55:00'),
(21, 5, 2, 5, '08:14:00', '08:15:00'),
(22, 5, 1, 6, '08:25:00', NULL),
(23, 6, 6, 1, NULL, '07:30:00'),
(24, 6, 3, 2, '08:14:00', '08:15:00'),
(25, 6, 1, 3, '08:45:00', NULL),
(26, 7, 5, 1, NULL, '08:00:00'),
(27, 7, 4, 2, '08:14:00', '08:15:00'),
(28, 7, 2, 3, '08:49:00', '08:50:00'),
(29, 7, 1, 4, '09:00:00', NULL),
(30, 8, 6, 1, NULL, '08:20:00'),
(31, 8, 4, 2, '08:49:00', '08:50:00'),
(32, 8, 1, 3, '09:35:00', NULL);
I hope I've given enough information to allow someone to be able to help.
Thanks for your time and patience.
*****UPDATE*****
OK, so I've made some progress. Run the following SQL in separate segments (it errors if I run it all together):
DROP TEMPORARY TABLE IF EXISTS tableSTART;
.
DROP TEMPORARY TABLE IF EXISTS tableEND;
.
CREATE TEMPORARY TABLE IF NOT EXISTS tableSTART AS
(
SELECT schedules.id, journeys.journey, stations.station, schedules.s_order,
schedules.a_time, schedules.d_time
FROM `schedules`
JOIN `journeys` ON schedules.j_id = journeys.j_id
JOIN `stations` ON schedules.s_id = stations.s_id
WHERE stations.station = "STB"
);
.
CREATE TEMPORARY TABLE IF NOT EXISTS tableEND AS
(
SELECT schedules.id, journeys.journey, stations.station, schedules.s_order,
schedules.a_time, schedules.d_time
FROM `schedules`
JOIN `journeys` ON schedules.j_id = journeys.j_id
JOIN `stations` ON schedules.s_id = stations.s_id
WHERE stations.station = "STE"
);
.
select ts.journey, ts.station startStn, ts.s_order startOrder, te.journey,
te.station endStn, te.s_order endOrder
from tableSTART ts
cross join tableEND te
This shows the journeys, start (STB) and end (STE) stations along with the station order of the start and end stations. How do I filter this to leave only those journeys where the start order is less than the end order (i.e travelling from STB to STE and excluding journeys from STE to STB?
Thanks in advance.
We want journeys that:
select j.journey
from journeys j
-- have some scheduled stop at STB
join schedules sc1 on sc1.j_id = j.id
join stations st1 on st1.id = sc1.s_id
and st1.station = 'STB'
-- have some (other) stop scheduled at STD
join schedules sc2 on sc2.j_id = j.id
join stations st2 on st2.id = sc2.s_id
and st2.station = 'STD'
-- stop at STB before STD
where sc1.s_order < sc2.s_order;
Note you may also write:
select j.journey
from journeys j
join schedules sc1 on sc1.j_id = j.id
join stations st1 on st1.id = sc1.s_id
join schedules sc2 on sc2.j_id = j.id
join stations st2 on st2.id = sc2.s_id
where st1.station = 'STB'
and st2.station = 'STD'
and sc1.s_order < sc2.s_order;
(putting conditions in the on
of inner joins is the same putting them in where
)
(probably trivial, after the above)
select j.journey
from journeys j
join schedules sc on sc.j_id = j.id
join stations st on st.id = sc.s_id
where st.station = 'STC';
Note that this is exactly your query with less columns and a where
:)