I currently am doing this to get some data from our table:
SELECT DISTINCT(CategoryID),Distance FROM glinks_DistancesForTowns WHERE LinkID = $linkID ORDER BY Distance LIMIT 20
I'm iterating over that for every link id we have (50k odd). Them I'm processing them in Perl with:
my @cats;
while (my ($catid,$distance) = $sth->fetchrow) {
push @cats, $cat;
}
I'm trying to see if there is a better way to do this in a sub-query with MySQL, vs doing 50k smaller queries (i.e one per link)
The basic structure of the table is:
glinks_Links
ID
glinks_DistancesForTowns
LinkID
CategoryID
Distance
I'm sure there must be a simple way to do it - but I'm just not seeing it.
As requested - here is a dump of the table structure. Its actually more complex than that, but the other fields just hold values so I've taken those bits out to give a cleaner over-view of the structure:
CREATE TABLE `glinks_DistancesForTowns` (
`LinkID` int(11) DEFAULT NULL,
`CategoryID` int(11) DEFAULT NULL,
`Distance` float DEFAULT NULL,
`isPaid` int(11) DEFAULT NULL,
KEY `LinkID` (`LinkID`),
KEY `CategoryID` (`CategoryID`,`isPaid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
CREATE TABLE `glinks_Links` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Title` varchar(100) NOT NULL DEFAULT '',
`URL` varchar(255) NOT NULL DEFAULT 'http://',
PRIMARY KEY (`ID`),
KEY `booking_hotel_id_fk` (`booking_hotel_id_fk`)
) ENGINE=MyISAM AUTO_INCREMENT=617547 DEFAULT CHARSET=latin1
This is the kind of thing I'm hoping for:
SELECT glinks_Links.ID FROM glinks_Links as links, glinks_DistancesForTowns as distance (
SELECT DISTINCT(CategoryID),Distance FROM distance WHERE distance.LinkID = links.ID ORDER BY Distance LIMIT 20
)
But obviously that doesn't work;)
It sounds like you want the top 20 towns by distance for each link, right?
MySQL 8.0 supports window functions, and this would be the way to write the query:
WITH cte AS (
SELECT l.ID, ROW_NUMBER() OVER(PARTITION BY l.ID ORDER BY d.Distance) AS rownum
FROM glinks_Links as l
JOIN glinks_DistancesForTowns AS d ON d.LinkID = l.ID
) SELECT ID FROM cte WHERE rownum <= 20;
Versions older than 8.0 do not support these features of SQL, so you have to get creative with user-defined variables or self-joins. See for example my answer to How to SELECT the newest four items per category?