I am working on MS SQL Server.
I have a table called "User" with three columns and a default clustered index created on its primary key, UserId.
I have a Word file that contains user information line by line; there are almost 10,000 lines.
I have a program, written in C# in Visual Studio, that reads the user information from the Word file and inserts it into the database. The program uses the repository and unit-of-work patterns.
The program's workflow is as follows: 1) read a single line of user information from the Word file; 2) create a user object from that information; 3) write the object to the repository; 4) commit the work, at which point the database INSERT statement executes.
Basically, the program executes an INSERT statement each time it reads one user's information from the Word file.
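To show the shape of the loop, here is an illustrative sketch (the class and method names are placeholders, not my actual repository/unit-of-work code):

```csharp
// Illustrative only: one INSERT and one commit per line read from the file.
foreach (var line in File.ReadLines(path))
{
    var user = ParseUser(line);   // build the user object from one line
    repository.Add(user);         // stage it in the repository
    unitOfWork.Commit();          // executes a single INSERT statement
}
```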
Here is my question.
I logged the time of each insert statement, and I can see that it takes longer as more data is inserted. Is this because the database has more data to sort in the B-tree, since the table has a default clustered index on its primary key?
Please enlighten me on what happens before and after an insert statement in SQL Server.
Thank you, guys.
Is this because the database has more data to sort in the B-tree, since the table has a default clustered index on its primary key?
No. In fact, an auto-incrementing UserId is an ideal choice for the clustered index. Since the primary-key candidate is auto-incremented, new rows are always appended to the last page. However, an UPDATE statement can cause a page split if, say, the new Address value is longer than the old one. If possible, make Address a varchar and keep it as narrow as possible.
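A minimal sketch of such a table definition (the column names other than UserId are assumptions, since the question does not list them):

```sql
-- Hypothetical schema: auto-incrementing clustered PK, narrow varchar columns.
CREATE TABLE [User]
(
    UserId  INT IDENTITY(1,1) NOT NULL,
    [Name]  VARCHAR(100) NOT NULL,   -- assumed column
    Address VARCHAR(200) NULL,       -- varchar, kept as narrow as possible
    CONSTRAINT PK_User PRIMARY KEY CLUSTERED (UserId)
);
```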
The main problem is the very frequent inserts, i.e. the very frequent database hits. If there are 1,000 records to insert, create a user-defined table type (UDT) and insert 50 or 100 rows at a time. You can do this by applying paging logic in your insert method; it is easy and will be helpful. Also optimise your data-access code: use connection pooling, keep the data types consistent, and pass the length of each variable in the DAL (SqlParameter).
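Here is a sketch of the batched approach with a table-valued parameter. The type name `dbo.UserTableType` and the columns `Name` and `Address` are assumptions; your schema will differ, and the table type must be created on the server first.

```csharp
// Sketch: stage a batch of rows in a DataTable and send them to SQL Server
// as a table-valued parameter, so the whole batch costs one round trip.
using System.Data;
using Microsoft.Data.SqlClient;

static void InsertBatch(string connectionString, DataTable batch)
{
    using var conn = new SqlConnection(connectionString);
    conn.Open();

    using var cmd = new SqlCommand(
        "INSERT INTO [User] (Name, Address) SELECT Name, Address FROM @Users",
        conn);

    var p = cmd.Parameters.AddWithValue("@Users", batch);
    p.SqlDbType = SqlDbType.Structured;
    p.TypeName = "dbo.UserTableType";   // the UDT created on the server

    cmd.ExecuteNonQuery();              // one round trip for the whole batch
}
```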
I logged the time of each insert statement, and I can see that it takes longer as more data is inserted. Is this because the database has more data to sort in the B-tree, since the table has a default clustered index on its primary key?
No. Since UserId is ever-increasing, no sorting work happens. Maybe there is a fault in the insert SQL script, but the main culprit is the very frequent database hits.
Please enlighten me on what happens before and after an insert statement in SQL Server.
Whenever data is inserted, the insert takes place in two places: at the table level, in the data pages, and at the index level.
A clustered index stores the actual data rows of the table at the leaf level of the index; it also controls the sort order of the data within the data pages and the order of the pages themselves, based on the clustered index key.
How does an index page split happen? Suppose the index has 3 intermediate-level pages and 4 leaf-level pages. If you insert one or two records, nothing happens and the insert is fast. But insert a few more records (say 10 or 20) and both the intermediate and leaf levels grow, because an index page has a fixed space limit: when it can no longer accommodate new records, it splits to make room for them. This is another reason column lengths should be as narrow as possible.
In your case, though, the clustered index does not have to do any sorting, so that is one less piece of work it performs. The page-split cost is also lower than it would be for a non-auto-increment or wide key.
Still, since you are inserting records very frequently, these splits will impact your performance every now and then.
With a bulk insert there are fewer index page splits, so performance improves.
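For a one-off load like this, `SqlBulkCopy` is a sketch worth considering: it streams all rows in a single bulk operation instead of thousands of individual INSERTs. The column names here are assumptions.

```csharp
// Sketch: bulk-load a DataTable of users in one operation.
using Microsoft.Data.SqlClient;
using System.Data;

static void BulkInsert(string connectionString, DataTable users)
{
    using var bulk = new SqlBulkCopy(connectionString)
    {
        DestinationTableName = "[User]",
        BatchSize = 1000                 // rows sent per internal batch
    };
    bulk.ColumnMappings.Add("Name", "Name");       // assumed column names
    bulk.ColumnMappings.Add("Address", "Address");
    bulk.WriteToServer(users);
}
```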
In a heap table there is no clustered index to maintain, so the engine has one less task to do, and very frequent inserts might get faster.
But you have to weigh insert performance against select performance: if this table is frequently used to fetch records, keep the clustered index; if it is seldom queried, or holds fewer than 100 records, a heap table is fine.
Further reading: