Newbie to MySQL and SQL in general - so please be gentle :-)
I have a table with a very high number of rows. The table is:
create table iostat (
pkey int not null auto_increment,
serverid int not null,
datestr char(15) default 'NULL',
esttime int not null default 0,
rs float not null default 0.0,
ws float not null default 0.0,
krs float not null default 0.0,
kws float not null default 0.0,
wait float not null default 0.0,
actv float not null default 0.0,
wsvct float not null default 0.0,
asvct float not null default 0.0,
pctw int not null default 0,
pctb int not null default 0,
device varchar(50),
avgread float not null default 0.0,
avgwrit float not null default 0.0,
primary key (pkey),
index i_serverid (serverid),
index i_esttime (esttime),
index i_datestr (datestr),
index i_rs (rs),
index i_ws (ws),
index i_krs (krs),
index i_kws (kws),
index i_wait (wait),
index i_actv (actv),
index i_wsvct (wsvct),
index i_asvct (asvct),
index i_pctb (pctb),
index i_device (device),
index i_servdate (serverid, datestr),
index i_servest (serverid, esttime)
)
engine = MyISAM
data directory = '${IOSTATdatadir}'
index directory = '${IOSTATindexdir}'
;
Right now the table has 834,317,203 rows.
Yes - I need all the data. The highest level organization of the data is by the collection date (datestr). It is a CHAR instead of a date to preserve the specific date format I use for the various load, extract, and analysis scripts.
Each day adds about 16,000,000 rows.
One of the operations I would like to speed up is (Limit is generally 50 but ranges from 10 to 250):
create table TMP_TopLUNsKRead
select
krs, device, datestr, esttime
from
iostat
where
${WHERECLAUSE}
order by
krs desc limit ${Limit};
WHERECLAUSE is:
serverid = 29 and esttime between X and Y and device like '%t%'
where X and Y are timestamps spanning anywhere from 4 minutes to 24 hours.
I'd prefer to not change the DB engine. This lets me put data and indexes on separate drives which gave me significant overall performance. It's also a total of 1.6 billion rows, which would take an insane amount of time to reload.
device like '%t%'
This is the killer. The leading %
means it is a search of the whole column, or index if it's indexed, not an index lookup. See if you can do without the leading %
.