So to sum it up quickly: I have a SQL Server
database (using SSMS to work with it) and it has a table with an order_num column and a description column.
e.g. 16548221587 | Small hairbrush.
I have an index on the order_num column.
I then have a VB.net app which lets the user supply a .txt file with a massive list of order_nums (>150,000, one per line). It reads these line by line, searches the database for each one, adds every match to a temp table, and then streamwrites
the results to a "Results" .txt file.
In regards to the title of this question: the code I'll post below works! But I'm clocking the read/find/insert into the temp table at 0.427 seconds per search, and with 150,000 records that works out to over 16 hours (0.427 s × 150,000 ≈ 17.8 hours)! So that's what I'm wondering: am I doing this a topsy-turvy way, or am I expecting way too much in hoping that reading, finding, and retrieving that many records would go quicker?
' Assumes an open SqlConnection named con
Dim result As IEnumerable(Of String)

If System.IO.File.Exists(TextBox2.Text) Then
    ' Read in the file list of order numbers to search (one 13-character number per line)
    result = From n In System.IO.File.ReadLines(TextBox2.Text)
             Where n.Length = 13
             Select n.Substring(0, 13)
Else
    MessageBox.Show("The file path you entered seems to be invalid. Please try again.")
    Exit Sub
End If

' One stored-procedure call per order number - this is the slow part
For Each word As String In result
    Using cmd As New SqlCommand("dbo.OrdersToTemp", con)
        cmd.CommandType = CommandType.StoredProcedure
        cmd.Parameters.Add("@OrderNum", SqlDbType.NVarChar).Value = word
        cmd.CommandTimeout = 3000
        cmd.ExecuteNonQuery()
    End Using
Next

' Stream the matches out to a "-results.txt" file next to the input file
Using sw As New StreamWriter(TextBox2.Text.Substring(0, TextBox2.TextLength - 4) + "-results.txt")
    Dim retrieveResults As New SqlCommand("dbo.GetResults", con)
    retrieveResults.CommandType = CommandType.StoredProcedure
    retrieveResults.CommandTimeout = 3000
    Using RDR = retrieveResults.ExecuteReader()
        Do While RDR.Read()
            Dim OrderDescription = RDR("Description").ToString()
            sw.WriteLine(OrderDescription)
        Loop
    End Using
End Using
UPDATE
I have taken some of the advice here and I now SqlBulkCopy the order_nums that need to be searched into a temp table, which is done quite fast. I am then using a query such as:
SELECT o.order_num, o.description
FROM OrderTable o
JOIN TempOrderIdTable t ON t.order_num = o.order_num
But it still seems quite slow: getting even just 170 results takes around 30 seconds, which in my eyes is very slow. I put a clustered index on order_num in OrderTable and NO index on the temp table, which is basically just the .txt file loaded into a SQL table.
UPDATE 2
So as I said, I now have a non-clustered index (order_num INCLUDE description) on OrderTable and a clustered index (order_num) on the temp table, BUT any sort of join, cross apply, etc. still takes over 33 seconds to join roughly 100 order_nums and return just 170 rows, which is still so slow. Here are the joins I am trying:
SELECT o.Order_num, t.description
FROM Temp_data o
JOIN OrderTable t ON t.Order_num = o.Order_num
SELECT x.Order_num, x.description
FROM OrderTable x
WHERE Order_num IN (SELECT Order_num FROM Temp_data)
SELECT x.Order_num, x.description
FROM OrderTable x
CROSS APPLY (SELECT o.Order_num FROM Temp_data o WHERE o.Order_num = x.Order_num) ord
SOLVED
So it was me being an idiot for the final bit, and you were all correct: when I was making the temp table I accidentally made the column nvarchar, whereas in the actual OrderTable order_num was just a varchar column. The type mismatch forces an implicit conversion on every comparison, which stops SQL Server from doing a straight index seek on the join. Sorry about that guys, me being half asleep!
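For anyone hitting the same thing, the fix is just to declare the temp table's column with the same type as OrderTable.order_num so the join compares like with like. A minimal sketch, assuming 13-character order numbers and the Temp_data name used above:

CREATE TABLE Temp_data (
    Order_num varchar(13) NOT NULL  -- varchar to match OrderTable.order_num (was nvarchar by mistake)
);
CREATE CLUSTERED INDEX ix ON Temp_data (Order_num);  -- the clustered index from UPDATE 2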
The issue with your code is that you execute this SQL command 150k times. That's never going to work (fast).
What you can do is read the 150k values from the file first, then insert them using SqlBulkCopy
into a table, for example as shown in this SO answer. Basically, do what your dbo.OrdersToTemp
procedure does, in your vb.net code - all at once instead of one row at a time. This should take a couple of seconds at most, given the latency of your current query.
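A minimal sketch of that bulk-copy step, assuming the open SqlConnection named con from your code, the dbo.TempOrderIdTable destination from your query, and the 13-character filter from your original snippet:

' Requires Imports System.Data and Imports System.Data.SqlClient
Dim tbl As New DataTable()
tbl.Columns.Add("order_num", GetType(String))

' Load all order numbers from the file into an in-memory table
For Each line As String In System.IO.File.ReadLines(TextBox2.Text)
    If line.Length = 13 Then
        tbl.Rows.Add(line)
    End If
Next

Using bulk As New SqlBulkCopy(con)
    bulk.DestinationTableName = "dbo.TempOrderIdTable"
    bulk.ColumnMappings.Add("order_num", "order_num") ' map by name in case column order differs
    bulk.BatchSize = 10000
    bulk.WriteToServer(tbl)
End Using

That replaces the 150k individual stored-procedure calls with one bulk insert.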
For the following query:
SELECT o.order_num, o.description
FROM OrderTable o
JOIN TempOrderIdTable t ON t.order_num = o.order_num
Since I'm assuming OrderTable can contain multiple records per order_num (you mentioned returning 170 rows for 100 orders), you could use an index like this:
CREATE INDEX ix ON OrderTable (order_num) INCLUDE (description)
If the order numbers you have in the file are unique (and you can ensure uniqueness):
CREATE UNIQUE CLUSTERED INDEX ux ON TempOrderIdTable (order_num);
If your SQL Server edition supports it, you could compress the index with WITH (DATA_COMPRESSION = PAGE); on versions before SQL Server 2016 SP1 this requires Enterprise Edition (or Developer, which you can't use in a production environment).
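For example, combined with the covering index above (same hypothetical index name):

CREATE INDEX ix ON OrderTable (order_num) INCLUDE (description) WITH (DATA_COMPRESSION = PAGE);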