I have a project for literature researches with a table(in SQL Server 2014). I extract words from some books to comparison table. this table may have thousands of words. I write a query to compare words in different documents and want to show them side by side to show that each word frequency in different book. Write this:
SELECT c1.Word,c1.DocumentID,c1.WordCount,c2.DocumentID,c2.WordCount
FROM dbo.Comparation c1
JOIN dbo.Comparation c2 ON c2.Word = c1.Word AND c1.DocumentID <> c2.DocumentID
this query takes 15s on my machine. then i decided to show them page by page and write this:
SELECT c1.Word,c1.DocumentID,c1.WordCount,c2.DocumentID,c2.WordCount
FROM dbo.Comparation c1
JOIN dbo.Comparation c2 ON c2.Word = c1.Word AND c1.DocumentID <> c2.DocumentID
ORDER BY c1.DocumentID OFFSET 0 ROWS FETCH NEXT 15 ROWS ONLY;
now for 15 or little more records speed of retrieve was good, but my problem is: when offset is small, every things are OK but when i increase offset query runs slow. Is there any way to load and compare these words fast?
Thank you.
With best regards.
It looks like you are joining on the nvarchar column "Word". As explained here this has performance issues when used over a large number of rows.
Try maybe generating an extra "WordID" column which would be unique for each word and then joining on that.