Search code examples
sqlsql-serverperformancequery-optimization

How to compare records in one table quickly?


I have a project for literature researches with a table(in SQL Server 2014). I extract words from some books to comparison table. this table may have thousands of words. I write a query to compare words in different documents and want to show them side by side to show that each word frequency in different book. Write this:

SELECT c1.Word,c1.DocumentID,c1.WordCount,c2.DocumentID,c2.WordCount 
FROM dbo.Comparation c1 
JOIN dbo.Comparation c2 ON c2.Word = c1.Word AND c1.DocumentID <> c2.DocumentID

this query takes 15s on my machine. then i decided to show them page by page and write this:

SELECT c1.Word,c1.DocumentID,c1.WordCount,c2.DocumentID,c2.WordCount 
FROM dbo.Comparation c1 
JOIN dbo.Comparation c2 ON c2.Word = c1.Word AND c1.DocumentID <> c2.DocumentID 
ORDER BY c1.DocumentID OFFSET 0 ROWS FETCH NEXT 15 ROWS ONLY;

now for 15 or little more records speed of retrieve was good, but my problem is: when offset is small, every things are OK but when i increase offset query runs slow. Is there any way to load and compare these words fast?

Thank you.

With best regards.

Comparison table


Solution

  • It looks like you are joining on the nvarchar column "Word". As explained here this has performance issues when used over a large number of rows.

    Try maybe generating an extra "WordID" column which would be unique for each word and then joining on that.