Search code examples
sqlsql-serverdatabasesql-server-2017

Improve SQL Server query performance on large tables MsSQL vs MySQL


i am having production system running on latest MsSQL, however i am getting very bad performance for simple select statement when there is a lot of rows (10M+).

There is no such thing as hash index in MsSQL (as far as I know) so in my question i compare MySQL community database and MSSQL 2017 running on the same machine with the exception that MySQL runs under windows ubuntu virtual machine.

In sample script i create 100 million rows and execute it. In MS SQL the performance is so bad that the simple query took almost 2 seconds. In MySQL the query took 0.0011 sec. I assume that MS SQL should be better than free MySQL so i expect i miss something with indexing.

Can somebody please advise me how to create more efficient index with MsSQL?

Test table MSSQL:

drop table if exists [dbo].[TEST_TRADE];


CREATE TABLE [dbo].[TEST_TRADE](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [TradeID] [varchar](250) NOT NULL,
    [ValidFrom] [datetimeoffset](7) NULL,
    [ValidUntil] [datetimeoffset](7) NULL,
)
GO  

CREATE CLUSTERED INDEX IX_Primary   
    ON [dbo].[TEST_TRADE] ([ID]);   
GO  

CREATE NONCLUSTERED INDEX IX_TradeID   
    ON [dbo].[TEST_TRADE] ([TradeID]);   
GO

CREATE NONCLUSTERED INDEX IX_ValidUntil   
    ON [dbo].[TEST_TRADE] ([ValidUntil]);   
GO

Declare @Id int
Set @Id = 1

While @Id <= 100000000
Begin 
   Insert Into [dbo].[TEST_TRADE] ([TradeID]) values (
   'ID'+CAST(@Id as nvarchar(10))),(
   'ID'+CAST(@Id+1 as nvarchar(10))),(
   'ID'+CAST(@Id+2 as nvarchar(10))),(
   'ID'+CAST(@Id+3 as nvarchar(10))),(
   'ID'+CAST(@Id+4 as nvarchar(10))),(
   'ID'+CAST(@Id+5 as nvarchar(10))),(
   'ID'+CAST(@Id+6 as nvarchar(10))),(
   'ID'+CAST(@Id+7 as nvarchar(10))),(
   'ID'+CAST(@Id+8 as nvarchar(10))),(
   'ID'+CAST(@Id+9 as nvarchar(10))),(
   'ID'+CAST(@Id+10 as nvarchar(10))),(
   'ID'+CAST(@Id+11 as nvarchar(10))),(
   'ID'+CAST(@Id+12 as nvarchar(10))),(
   'ID'+CAST(@Id+13 as nvarchar(10))),(
   'ID'+CAST(@Id+14 as nvarchar(10))),(
   'ID'+CAST(@Id+15 as nvarchar(10))),(
   'ID'+CAST(@Id+16 as nvarchar(10))),(
   'ID'+CAST(@Id+17 as nvarchar(10))),(
   'ID'+CAST(@Id+18 as nvarchar(10))),(
   'ID'+CAST(@Id+19 as nvarchar(10))
   )
   Print @Id
   Set @Id = @Id + 20
End

MySQL test table:

drop table if exists `Test`.`TEST_TRADE`;
CREATE TABLE `Test`.`TEST_TRADE` ( `ID` BIGINT NOT NULL AUTO_INCREMENT, `TradeID` VARCHAR(250) NULL default null, `ValidFrom` TIMESTAMP NULL default null, `ValidUntil` TIMESTAMP NULL default null, PRIMARY KEY (`ID`), INDEX (`TradeID`) USING HASH, INDEX (`ValidUntil`)) ENGINE = InnoDB;

drop PROCEDURE if EXISTS InsertRand;
DELIMITER $$
CREATE PROCEDURE InsertRand(IN NumRows INT)
    BEGIN
        DECLARE i INT;
        SET i = 1;
        START TRANSACTION;
        WHILE i <= NumRows DO
            INSERT INTO TEST_TRADE(TradeID) VALUES 
            (concat("ID",i)),
            (concat("ID",i+1)),
            (concat("ID",i+2)),
            (concat("ID",i+3)),
            (concat("ID",i+4)),
            (concat("ID",i+5)),
            (concat("ID",i+6)),
            (concat("ID",i+7)),
            (concat("ID",i+8)),
            (concat("ID",i+9)),
            (concat("ID",i+10)),
            (concat("ID",i+11)),
            (concat("ID",i+12)),
            (concat("ID",i+13)),
            (concat("ID",i+14)),
            (concat("ID",i+15)),
            (concat("ID",i+16)),
            (concat("ID",i+17)),
            (concat("ID",i+18)),
            (concat("ID",i+19))
            ;
            SET i = i + 20;
        END WHILE;
        COMMIT;
    END$$
DELIMITER ;

CALL InsertRand(100000000);

SQL Profiler execution

MySQL execution


Solution

  • If you are selecting 10m + rows, data size per row for you SQL server table is 42 bytes.
    (bigint = 8 bytes) + (varchar of 12 chars = 14 bytes) + (datetimeoffset = 10 bytes)*2 So 10 million rows should be 420 000 000 bytes which is aproximate 400MB.

    So reading 400MB data in 2 sec is 400MB/2s = 200MB/s, which is a reasonable speed for accessing hard drive. But 400MB/0.0011s is 363 636,36MB/s, that is way beyond any hard drive speed and very similar to the RAM accessing speed.

    So the table in MySQL must be fully cached in Memory and that is why your query can be finshed at 0.0011 sec.

    You need to find a way to cached you MsSQL table fully in Memory too in order to achieve similar speed.

    Edited: If your query is

    Select * FROM [dbo].[TEST_TRADE]
    where [TradeId] = 'ID99999999'
    

    Create clustered index on column [TradeId] first, then create PK on [ID], and if possible, use fixed length char(12) for [TradeId] instread of varchar.

    Edited:

    Tested, making a clustered index on column [TradeId] will speed the query up to 50%.

    And I would suggest to check the index fragmentation for [IX_TradeID] first and rebuild the index regularly.