Tags: sql, sql-server, vb.net, streamreader, streamwriter

Am I being impatient, or is this not efficient?


To sum it up quickly: I have a SQL Server database (using SSMS to work with it) with a table that has an order_num column and a description column, e.g. 16548221587 | Small hairbrush. I have an index on the order_num column.

I then have a VB.NET app that lets the user supply a .txt file with a massive list of order_nums (>150,000, one per line). It reads these line by line, searches the database for each one, adds every match to a temp table, and then stream-writes the results to a "Results" .txt file.

In regards to the title of this question: my code, which I'll post below, works! But I'm clocking each read/find/insert into the temp table at 0.427 seconds per search, and multiplied across 150,000 records that's over 16 hours! So that's what I'm wondering: am I doing this in a topsy-turvy way, or am I simply expecting far too much from reading, finding, and retrieving that many records?

If System.IO.File.Exists(TextBox2.Text) Then
    ' Read in the file list of order numbers to search
    result = From n In System.IO.File.ReadLines(TextBox2.Text)
             Where n.Length = 13
             Select n.Substring(0, 13)
Else
    MessageBox.Show("The file path you entered seems to be invalid. Please try again.")
    Exit Sub
End If

' One stored-procedure call per order number
For Each word As String In result
    Dim cmd As New SqlCommand("dbo.OrdersToTemp", con)
    cmd.CommandType = CommandType.StoredProcedure
    cmd.Parameters.Add("@OrderNum", SqlDbType.NVarChar).Value = word
    cmd.CommandTimeout = 3000
    cmd.ExecuteNonQuery()
Next

' Write the matched descriptions out to a "-results.txt" file
Using sw As New StreamWriter(TextBox2.Text.Substring(0, TextBox2.TextLength - 4) + "-results.txt")
    Dim retrieveResults As New SqlCommand("dbo.GetResults", con)
    retrieveResults.CommandType = CommandType.StoredProcedure
    retrieveResults.CommandTimeout = 3000

    Using RDR = retrieveResults.ExecuteReader()
        Do While RDR.Read()
            OrderDescription = RDR("Description").ToString()
            sw.WriteLine(OrderDescription)
        Loop
    End Using
End Using

UPDATE

I have taken some of the advice on this, and I now use SqlBulkCopy to load the order_nums that need to be searched into a temp table; this part is quite fast. I am then using a query such as:

SELECT o.order_num, description
from OrderTable o
join TempOrderIdTable t on t.order_num = o.order_num

But it still seems quite slow: getting even just 170 results takes around 30 seconds, which in my eyes is very slow. I put a clustered index on order_num in OrderTable and no index on the temp table, which is basically just the .txt file loaded into a SQL table.

UPDATE 2

So as I said, I now have a non-clustered index (order_num INCLUDE description) on OrderTable and a clustered index (order_num) on the temp table, BUT any sort of join, cross apply, etc. still takes over 33 seconds to join just 100 order_nums and return 170 rows, which is still so slow. Here are the joins I am trying:

select o.Order_num, t.description
from Temp_data o
join OrderTable t on t.Order_num = o.Order_num


select x.Order_num, x.description
from OrderTable x
where Order_num in (select Order_num from Temp_data)


select x.Order_num,x.description
from OrderTable x
cross apply (select o.Order_num from Temp_data o where o.Order_num= x.Order_num) ord

SOLVED So it was me being an idiot for the final bit, and you were all correct: when I was creating the temp table, I accidentally made the column an nvarchar, whereas in the actual OrderTable the order_num column is just a varchar. Sorry about that, guys, me being half asleep!
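For anyone who hits the same wall: when the join columns have different types (nvarchar vs varchar), SQL Server applies an implicit conversion to one side, which can prevent an index seek on the order table and turn the join into a scan. A minimal sketch of the mismatch and the fix, with the column length assumed from the 13-character order numbers in the question:

```sql
-- Before (the accidental mismatch): nvarchar here vs varchar in OrderTable,
-- so every joined row pays an implicit conversion and the index can't be seeked.
CREATE TABLE Temp_data (Order_num nvarchar(13));

-- After (the fix): same type on both sides of the join.
DROP TABLE Temp_data;
CREATE TABLE Temp_data (Order_num varchar(13));
```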


Solution

  • The issue with your code is that you execute this SQL command 150k times. That's never going to work (fast).

    What you can do instead is read the 150k values from the file first, then insert them into a table using SqlBulkCopy, for example as shown in this SO answer. Basically, do what your dbo.OrdersToTemp procedure does in your VB.NET code, all at once instead of one row at a time. This should take a couple of seconds at most, given the latency of your current query.
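    A sketch of that bulk-load step in VB.NET (names such as con and TempOrderIdTable are assumed from the question; the DataTable column should use the same type as the destination column):

    ```vb.net
    ' Build one DataTable from the file, then bulk copy it in a single round trip.
    ' Assumes con is an open SqlConnection and dbo.TempOrderIdTable(order_num) exists.
    Dim table As New DataTable()
    table.Columns.Add("order_num", GetType(String))

    For Each line In System.IO.File.ReadLines(TextBox2.Text)
        If line.Length = 13 Then
            table.Rows.Add(line.Substring(0, 13))
        End If
    Next

    Using bulk As New SqlBulkCopy(con)
        bulk.DestinationTableName = "dbo.TempOrderIdTable"
        bulk.ColumnMappings.Add("order_num", "order_num")
        bulk.BulkCopyTimeout = 300
        bulk.WriteToServer(table)
    End Using
    ```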

    For the following query:

    SELECT o.order_num, description
    from OrderTable o
    join TempOrderIdTable t on t.order_num = o.order_num
    

    I'm assuming OrderTable can contain multiple records per order_num (you mentioned 170 rows returned for 100 orders), so you could use indexes such as:

    CREATE INDEX ix ON OrderTable (order_num) INCLUDE (description)
    

    If the order numbers you have in the file are unique (and you can ensure uniqueness):

    CREATE UNIQUE CLUSTERED INDEX ux ON TempOrderIdTable (order_num);
    

    If your SQL Server edition supports it, you could compress the index with WITH (DATA_COMPRESSION = PAGE), but this requires an Enterprise license (or Developer, which you can't use in a production environment).
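    For example, the covering index above could be created with page compression like so (same assumed table and column names):

    ```sql
    CREATE INDEX ix ON OrderTable (order_num)
        INCLUDE (description)
        WITH (DATA_COMPRESSION = PAGE);
    ```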