Search code examples
sql-serverblocking

Concurrent inserts into table with non-clustered GUID primary key


I have this table on Microsoft SQL Server 2012:

CREATE TABLE [dbo].[Addresses_line_format]
(
    address_id UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT pk_addresses_line_format PRIMARY KEY NONCLUSTERED,

    country_id UNIQUEIDENTIFIER NOT NULL
        CONSTRAINT fk_address_single_line_country FOREIGN KEY REFERENCES Countries (country_id)
            ON UPDATE NO ACTION
            ON DELETE NO ACTION,

    address_line NVARCHAR(255) NOT NULL,
    district_line NVARCHAR(255) NOT NULL
)

With 3.362.817 records in it.

Our application consumes messages from a queue, with 10 concurrent consumers. Each consumer inserts a line into this table, using the following statement:

INSERT [dbo].[Addresses_line_format] ([address_id], [country_id], [address_line], [district_line])
VALUES (@0, @1, @2, @3)

Looking at statistics, the average elapsed time for this query is 16 seconds, which is obviously way too much.

I'm wondering if this is because of how heap tables are handling inserts like described here, or do you have any ideas what is causing this?

I tried changing the PK to be clustered, but without any noticeable performance improvements.

Queries against the table are always performed using the following:

SELECT country_id, address_line, district_line
FROM Addresses_line_format
WHERE address_id = @1

Solution

  • Well, if that GUID isn't the clustered key - what IS the clustered key on that table? It should have one - a well chosen clustered key speeds up operations - even inserts and deletes! See Kimberly Tripp's blog post The Clustered Index Debate Continues... for a great explanation and more background.

    When you read Kim Tripp's blog post and all her other articles on the subject, it's clear that a good clustering key is narrow, unique, static and ever-increasing - perfect fits for an INT or BIGINT identity column.

    Earlier versions of SQL Server (before 2000 or 2005) did in fact have insert hotspots if all the inserts were happening in a single spot - those negative impacts have since been removed, those are no longer a problem, and therefore, using an INT IDENTITY column as your clustering key is a nearly optimal choice for the most part.