I have below query where it is taking long time to execute as it is having OR opeartion.Here Institution 59958 is a global institution. It can have stats itself and its children can have stats; so 'parentinstitutionid' with 59958 or institutionid 59958 we require usage data for.so I am using OR opeartor.
select date(SUBDATE(m.timestamp, INTERVAL (day(m.timestamp) -1) day)) as month,
count(*) as c,
sum(case when a.streamid = 5 then 1 else 0 end) as education,
sum(case when a.streamid in(7, 1) then 1 else 0 end) as research,
sum(case when searchterms <> '' then 1 else 0 end) as search
from stats_to_institution as s
join masterstats_innodb as m on s.statid = m.id
left join articles as a on (a.productid >= 49 and a.productid = m.article)
where m.timestamp >= '2022-01-01'
and (s.institutionid = 59958 or s.institutionid in ( select institutionid from institutions where parentinstitutionid = 59958))
group by month;
Here below condition is taking time (s.institutionid = 59958 or s.institutionid in (select institutionid from institutions where parentinstitutionid = 59958))
I cannot use CTE as it is on 5.6 version.Is any other way to modify above condition for good performance?. If I remove s.institutionid = 59958 it takes only 5 secs to run as it will not have OR operator.
Any suggestion on this?
table structure as follows
CREATE TABLE `institutions` (
`InstitutionID` int(11) NOT NULL AUTO_INCREMENT,
`Name` varchar(200) DEFAULT NULL,
`Approved` tinyint(1) NOT NULL DEFAULT '0',
`DateAdded` datetime DEFAULT CURRENT_TIMESTAMP,
`IsAcademic` tinyint(1) DEFAULT NULL,
`IsIndustry` tinyint(1) DEFAULT NULL,
`LogoFile` varchar(50) DEFAULT NULL,
`NotifyLibEveryXRequests` int(11) DEFAULT NULL,
`IsParentInstitution` int(1) NOT NULL DEFAULT '0',
`ParentInstitutionID` int(11) DEFAULT NULL,
PRIMARY KEY (`InstitutionID`),
KEY `Institutions_Name` (`Name`),
KEY `ParentInstitutionID` (`ParentInstitutionID`),
FULLTEXT KEY `Name` (`Name`)
) ;
CREATE TABLE `masterstats_innodb` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`page` text COLLATE utf8_unicode_ci NOT NULL,
`video` int(11) NOT NULL,
`language` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`referrer` text COLLATE utf8_unicode_ci NOT NULL,
`joveuser` varchar(64) COLLATE utf8_unicode_ci DEFAULT NULL,
`timestamp` date NOT NULL DEFAULT '0000-00-00',
PRIMARY KEY (`id`,`timestamp`),
KEY `joveuser` (`joveuser`),
KEY `institutionid` (`institutionid`),
KEY `timestamp` (`timestamp`),
KEY `idx__video_timestamp` (`video`,`timestamp`)
) ;
CREATE TABLE `stats_to_institution` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`statid` int(11) NOT NULL,
`institutionid` int(11) NOT NULL,
PRIMARY KEY (`id`,`institutionid`),
UNIQUE KEY `statid_2` (`statid`,`institutionid`),
KEY `statid` (`statid`),
KEY `institutionid` (`institutionid`)
) ;
CREATE TABLE `articles` (
`ProductID` int(11) NOT NULL,
`Name` varchar(1000) DEFAULT NULL,
`Tags` varchar(1000) NOT NULL,
`D` varchar(2000) DEFAULT NULL,
`Active` tinyint(1) DEFAULT NULL,
`UserID` int(11) DEFAULT NULL,
`DateAdded` datetime DEFAULT NULL,
`Detail_Abstract` text,
`StreamID` int(11) DEFAULT NULL COMMENT '-1 = Errata, 1= Article, 2= Advertisment, 3 = Editorial, 4= Junk, 5=SE',
`DatePublished` datetime DEFAULT NULL,
`AccessType` int(11) DEFAULT NULL COMMENT '-1=Unpublished, 0=Closed, 1=Free, 2=Open, 3 = Open UK',
`Rep_Results` text,
`Stage` int(11) DEFAULT NULL,
`SectionID` int(11) DEFAULT NULL,
PRIMARY KEY (`ProductID`),
KEY `Articles_StreamID_Active_DatePublished` (`StreamID`,`Active`,`DatePublished`),
KEY `articles_idx_sectionid` (`SectionID`),
FULLTEXT KEY `DetailAbstractTest` (`Detail_Abstract`,`Name`),
FULLTEXT KEY `Materials` (`Materials`),
FULLTEXT KEY `title` (`Name`)
);
explain result
+----+-------------+--------------+--------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+-----------------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+-----------------+----------+----------------------------------------------+
| 1 | PRIMARY | m | ALL | PRIMARY,timestamp_video,joveuser,institutionid,video_institutionid,user_id,ip_binary,time_on_page,Article,timestamp,idx__video_timestamp | NULL | NULL | NULL | 19653526 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | a | eq_ref | PRIMARY | PRIMARY | 4 | stats.m.Article | 1 | Using where |
| 1 | PRIMARY | s | ref | statid_2,statid,institutionid | statid_2 | 4 | stats.m.id | 1 | Using where; Using index |
| 2 | SUBQUERY | institutions | ref | PRIMARY,ParentInstitutionID | ParentInstitutionID | 5 | const | 173 | Using index |
+----+-------------+--------------+--------+------------------------------------------------------------------------------------------------------------------------------------------+---------------------+---------+-----------------+----------+----------------------------------------------+
No CTE how about join?
select date(SUBDATE(m.timestamp, INTERVAL (day(m.timestamp) -1) day)) as month,
count(*) as c,
sum(case when a.streamid = 5 then 1 else 0 end) as education,
sum(case when a.streamid in(7, 1) then 1 else 0 end) as research,
sum(case when searchterms <> '' then 1 else 0 end) as search
from stats_to_institution as s
join (select institutionid
from institutions
where parentinstitutionid = 59958
union all select 59998
) x on x.institutionid = s.institutionid
join masterstats_innodb as m on s.statid = m.id
left join articles as a on (a.productid >= 49 and a.productid = m.article)
where m.timestamp >= '2022-01-01'
group by month;