Search code examples
sql-serverperformancegeometryquery-optimizationspatial

SQL Server paging query selecting spatial geometry data with where clause very slow compared to without


I've run into an odd combination of requirements that kills the performance of a query. This is a table created by ESRI's ArcGIS server in SQL Server and the query comes from a feature service paging through data in the table.

This is slow (paste the plan):

SELECT
    OBJECTID,
    Shape
FROM 
    dbo.PARCELS   
WHERE  
    dbo.PARCELS.GDB_ARCHIVE_OID IN 
        (SELECT GDB_ARCHIVE_OID 
         FROM  
             (SELECT 
                  GDB_ARCHIVE_OID,
                  ROW_NUMBER() OVER(PARTITION BY OBJECTID ORDER BY GDB_FROM_DATE DESC) rn_, 
                  GDB_IS_DELETE 
              FROM 
                  dbo.PARCELS 
              WHERE  
                  ((GDB_BRANCH_ID = 0 AND GDB_FROM_DATE <= '2023-12-22') 
                   OR (GDB_BRANCH_ID = 1 AND GDB_FROM_DATE <= '2023-12-22'))) br__ 
         WHERE 
             br__.rn_ = 1 AND br__.GDB_IS_DELETE = 0)  
ORDER BY 
    OBJECTID ASC 
    OFFSET 2000 ROWS 
        FETCH NEXT 2000 ROWS ONLY 

This is fast without the shape field which is a geometry data type (paste the plan):

SELECT
    OBJECTID
    --,Shape
FROM
    dbo.PARCELS   
WHERE  
    dbo.PARCELS.GDB_ARCHIVE_OID IN  
        (SELECT GDB_ARCHIVE_OID 
         FROM  
             (SELECT 
                  GDB_ARCHIVE_OID,
                  ROW_NUMBER() OVER(PARTITION BY OBJECTID ORDER BY GDB_FROM_DATE DESC) rn_, 
                  GDB_IS_DELETE 
              FROM 
                  dbo.PARCELS 
              WHERE  
                  ((GDB_BRANCH_ID = 0 AND GDB_FROM_DATE <= '2023-12-22') 
                   OR (GDB_BRANCH_ID = 1 AND GDB_FROM_DATE <= '2023-12-22'))) br__ 
         WHERE 
             br__.rn_ = 1 AND br__.GDB_IS_DELETE = 0)  
ORDER BY 
    OBJECTID ASC 
    OFFSET 2000 ROWS 
        FETCH NEXT 2000 ROWS ONLY 

This is fast without the where clause (paste the plan):

SELECT
    OBJECTID,
    Shape
FROM
    dbo.PARCELS   
--    WHERE  DBO.PARCELS.GDB_ARCHIVE_OID IN  (
--      SELECT GDB_ARCHIVE_OID 
--      FROM  (
--          SELECT GDB_ARCHIVE_OID,ROW_NUMBER() OVER(PARTITION BY OBJECTID ORDER BY GDB_FROM_DATE DESC) rn_, GDB_IS_DELETE 
--          FROM DBO.PARCELS 
--          WHERE  ((GDB_BRANCH_ID = 0 AND GDB_FROM_DATE <= '2023-12-22') OR (GDB_BRANCH_ID = 1 AND GDB_FROM_DATE <= '2023-12-22'))
--          ) br__ 
--      WHERE br__.rn_ = 1 AND br__.GDB_IS_DELETE = 0
--      )  
ORDER BY 
    OBJECTID ASC 
    OFFSET 2000 ROWS 
        FETCH NEXT 2000 ROWS ONLY 

This is fast without the paging (paste the plan):

SELECT
    OBJECTID,
    Shape
FROM
    dbo.PARCELS   
WHERE  
    dbo.PARCELS.GDB_ARCHIVE_OID IN  
        (SELECT 
             GDB_ARCHIVE_OID 
         FROM  
             (SELECT 
                  GDB_ARCHIVE_OID,
                  ROW_NUMBER() OVER(PARTITION BY OBJECTID ORDER BY GDB_FROM_DATE DESC) rn_, 
                  GDB_IS_DELETE 
              FROM 
                  dbo.PARCELS 
              WHERE  
                  ((GDB_BRANCH_ID = 0 AND GDB_FROM_DATE <= '2023-12-22') 
                   OR (GDB_BRANCH_ID = 1 AND GDB_FROM_DATE <= '2023-12-22'))) br__ 
         WHERE 
             br__.rn_ = 1 AND br__.GDB_IS_DELETE = 0)  
    ORDER BY 
        OBJECTID ASC 
    -- OFFSET 2000 ROWS 
    -- FETCH NEXT 2000 ROWS ONLY 

It's the combination of all three, the geometry column, paging and the where clause that kills the query. Without any one of those the execution time is under a second. With all three the execution time is 41 seconds.

Below is the table definition with indexes at the end:

CREATE TABLE [dbo].[PARCELS](
    [OBJECTID] [int] NOT NULL,
    [TXID_NMBR] [nvarchar](25) NULL,
    [TAX_MAP] [nvarchar](10) NULL,
    [CONTROL] [numeric](38, 8) NULL,
    [PARCEL] [nvarchar](25) NULL,
    [TIE_BACK] [nvarchar](25) NULL,
    [LAST_NAME] [nvarchar](50) NULL,
    [FIRST_NAME] [nvarchar](50) NULL,
    [FULL_NAME] [nvarchar](255) NULL,
    [OWNER_ADD] [nvarchar](50) NULL,
    [OWNER_ADD2] [nvarchar](50) NULL,
    [OWNER_CITY] [nvarchar](50) NULL,
    [OWNER_ST] [nvarchar](2) NULL,
    [OWNER_ZIP] [nvarchar](10) NULL,
    [MUNIC] [nvarchar](25) NULL,
    [SITUS] [nvarchar](255) NULL,
    [SITUS_CITY] [nvarchar](50) NULL,
    [SITUS_ST] [nvarchar](2) NULL,
    [SITUS_ZIP] [nvarchar](10) NULL,
    [USE_] [nvarchar](10) NULL,
    [NBHD] [nvarchar](10) NULL,
    [STORIES] [nvarchar](50) NULL,
    [STYLE] [nvarchar](50) NULL,
    [EXTERIOR] [nvarchar](50) NULL,
    [YR_BUILT] [nvarchar](4) NULL,
    [GRADE] [nvarchar](50) NULL,
    [BEDROOMS] [nvarchar](2) NULL,
    [BATHS] [nvarchar](2) NULL,
    [HALF_BATHS] [nvarchar](2) NULL,
    [BSMT] [int] NULL,
    [FIN_BSMT] [int] NULL,
    [HEAT] [nvarchar](50) NULL,
    [CENT_AIR] [nvarchar](50) NULL,
    [FIREPLACE] [nvarchar](5) NULL,
    [TERRAIN] [nvarchar](10) NULL,
    [SEWER] [nvarchar](50) NULL,
    [WATER] [nvarchar](10) NULL,
    [LIV_AREA] [int] NULL,
    [GRANTOR] [nvarchar](255) NULL,
    [DEED] [nvarchar](50) NULL,
    [SALE_DATE] [datetime2](7) NULL,
    [SALE_AMT] [int] NULL,
    [VAL_2_SALE] [numeric](38, 8) NULL,
    [PP_ACRE] [numeric](38, 8) NULL,
    [PP_AC_ADJ] [numeric](38, 8) NULL,
    [PP_SQFT] [numeric](38, 8) NULL,
    [PP_SF_ADJ] [numeric](38, 8) NULL,
    [C_AND_G] [nvarchar](1) NULL,
    [DEED_ACRES] [numeric](38, 8) NULL,
    [LAND_VAL] [int] NULL,
    [BLDG_VAL] [int] NULL,
    [OUT_VAL] [int] NULL,
    [PROP_VAL] [int] NULL,
    [LycoOnline] [nvarchar](255) NULL,
    [TYPE] [nvarchar](10) NULL,
    [MUNI_NAME] [nvarchar](25) NULL,
    [MUNI_TYPE] [nvarchar](25) NULL,
    [created_user] [nvarchar](255) NULL,
    [created_date] [datetime2](7) NULL,
    [last_edited_user] [nvarchar](255) NULL,
    [last_edited_date] [datetime2](7) NULL,
    [SALE_TYPE] [nvarchar](30) NULL,
    [VALID_SALE] [nvarchar](1) NULL,
    [Shape] [geometry] NULL,
    [Calc_Acres] [numeric](38, 8) NULL,
    [Notes] [nvarchar](150) NULL,
    [GlobalID] [uniqueidentifier] NOT NULL,
    [GDB_GEOMATTR_DATA] [varbinary](max) NULL,
    [GDB_ARCHIVE_OID] [int] IDENTITY(1,1) NOT NULL,
    [GDB_FROM_DATE] [datetime2](7) NOT NULL,
    [GDB_IS_DELETE] [smallint] NOT NULL,
    [GDB_BRANCH_ID] [int] NOT NULL,
    [GDB_DELETED_AT] [datetime2](7) NULL,
    [GDB_DELETED_BY] [nvarchar](255) NULL,
 CONSTRAINT [R831_pk] PRIMARY KEY CLUSTERED 
(
    [GDB_ARCHIVE_OID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 95, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

ALTER TABLE [dbo].[PARCELS] SET (LOCK_ESCALATION = DISABLE)
GO

ALTER TABLE [dbo].[PARCELS] ADD  DEFAULT ('{00000000-0000-0000-0000-000000000000}') FOR [GlobalID]
GO

ALTER TABLE [dbo].[PARCELS] ADD  CONSTRAINT [GDB_FROM_DATE831_def]  DEFAULT (CONVERT([datetime2](3),getutcdate())) FOR [GDB_FROM_DATE]
GO

ALTER TABLE [dbo].[PARCELS] ADD  CONSTRAINT [GDB_IS_DELETE831_def]  DEFAULT ((0)) FOR [GDB_IS_DELETE]
GO

ALTER TABLE [dbo].[PARCELS] ADD  CONSTRAINT [GDB_BRANCH_ID831_def]  DEFAULT ((0)) FOR [GDB_BRANCH_ID]
GO

ALTER TABLE [dbo].[PARCELS]  WITH NOCHECK ADD  CONSTRAINT [g799_ck] CHECK  (([SHAPE].[STSrid]=(2271)))
GO

ALTER TABLE [dbo].[PARCELS] NOCHECK CONSTRAINT [g799_ck]
GO

CREATE UNIQUE NONCLUSTERED INDEX [gdb_ct1_831] ON [dbo].[PARCELS]
(
    [OBJECTID] ASC,
    [GDB_FROM_DATE] ASC,
    [GDB_BRANCH_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 75, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

CREATE UNIQUE NONCLUSTERED INDEX [gdb_ct2_831] ON [dbo].[PARCELS]
(
    [GDB_BRANCH_ID] ASC,
    [GDB_FROM_DATE] ASC,
    [GDB_IS_DELETE] ASC,
    [OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 75, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

ALTER TABLE [dbo].[PARCELS] ADD  CONSTRAINT [R831_pk] PRIMARY KEY CLUSTERED 
(
    [GDB_ARCHIVE_OID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 95, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [R831_SDE_ROWID_UK] ON [dbo].[PARCELS]
(
    [OBJECTID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 75, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

CREATE SPATIAL INDEX [S799_idx] ON [dbo].[PARCELS]
(
    [Shape]
)USING  GEOMETRY_GRID 
WITH (BOUNDING_BOX =(2009646.876269, 328920.783953, 2326307.240322, 522100.77817), GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM), 
CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, FILLFACTOR = 95) ON [PRIMARY]
GO

CREATE UNIQUE NONCLUSTERED INDEX [UUID_831] ON [dbo].[PARCELS]
(
    [GlobalID] ASC,
    [GDB_BRANCH_ID] ASC,
    [GDB_FROM_DATE] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO

Any ideas why?


Solution

  • Your query could do with a good bit of improvement, as could your indexes.

    Your primary issue here is basically that there are no indexes that can support the whole query. It's made more complex by an unnecessary self-join.

    Firstly, the query can be rewritten to remove the self-join, which is unnecessary. You can pull all the columns straight through from the ROW_NUMBER query.

    Furthermore, I strongly suggest you consider Keyset Pagination. This means that instead of reading 2000 rows and discarding them, you jump to the exact place in the index that you need.

    So the query can be simplified as follows:

    SELECT TOP (2000)
      br.OBJECTID,
      br.Shape
    FROM (
        SELECT 
          p.*,
          ROW_NUMBER() OVER (PARTITION BY p.OBJECTID ORDER BY p.GDB_FROM_DATE DESC) rn
        FROM 
          dbo.PARCELS p
        WHERE p.GDB_BRANCH_ID BETWEEN 0 AND 1
          AND p.GDB_FROM_DATE <= '2023-12-22'
          AND p.OBJECTID > @previousMaxID    -- remove this line for the first page
    ) br
    WHERE
      br.rn = 1
      AND br.GDB_IS_DELETE = 0
    ORDER BY
      OBJECTID ASC;
    

    Note the use of the previous Max ID, in order to paginate to the next key. Remove the final WHERE clause for the first page.


    For indexing, Keyset Pagination normally requires a deterministic sort with a supporting index. The ROW_NUMBER also does, so it makes sense to use the same one. And given the row-numbering, we don't need to worry about an extra uniqueifier column, because there is guaranteed to be only one row per OBJECTID.

    I would suggest the following index, which will support all the WHERE clauses, as well as the sort requuirements for both the ROW_NUMBER and the Keyset Pagination:

    CREATE NONCLUSTERED INDEX [R831_SDE_ROWID_UK] ON dbo.PARCELS
    ( OBJECTID ASC, GDB_FROM_DATE DESC)
    INCLUDE ( GDB_BRANCH_ID, GDB_IS_DELETE, Shape );
    

    If p.GDB_BRANCH_ID BETWEEN 0 AND 1 is drastically reducing the rowcount then it might make sense to use a different index, perhaps similar to above but with a filter on GDB_BRANCH_ID.


    As you can see, the query plan is nice and neat.

    enter image description here

    db<>fiddle