Our SharePoint 2010 Crawl database has suddenly started using all available space for its log file (.ldf). Any additional space that we add gets used within an hour or so (e.g. 10 GB disappeared within minutes yesterday).
From SharePoint Central Admin we can see that no crawl is running (status is 'idle' and the 'Last crawl completed' field is populated).
Using SQL Server Management Studio we can see a long-running transaction calling the 'proc_MSS_CrawlReportPreprocessChanges' stored procedure, which is described at this link (in the 'SharePoint 2010/2013 and capacity planning for TempDB ...' section):-
http://sharepoint.it-professional.co.uk/
According to that article, 'proc_MSS_CrawlReportPreprocessChanges' uses a cursor, hence lots of TempDB activity.
This would explain our problem IF THE CRAWL WAS RUNNING but the crawl has finished.
So my main question is: what is causing the 'proc_MSS_CrawlReportPreprocessChanges' procedure to run, and how can we stop it?
Help please!
The problem seems to be related to crawl REPORTING rather than the running of the crawl itself.
There seem to be two relevant SharePoint services:-
Both of these services use the 'MSSCrawlUrlChanges' table, which currently has over 65 million records.
The 'cleanup' service above calls the stored procedure 'proc_MSS_CrawlReportCleanup', passing in a parameter for the number of days over which data should be deleted (I can't find where the parameter value is configured in SharePoint). I'm not sure why, but clearly this service / procedure is not clearing out the table as expected.
So, to clear out this table, I stopped the two services and ran the procedure manually, reducing the parameter value on each run so that each pass deleted several million records:-
Clearly this is only a temporary solution but it has stopped us running out of disk space every hour or so and kept the system running. I now need to try and determine why this happened and why the 'cleanup' service doesn't seem to have been working, plus using David's advice to get the crawl back on track.
A better way to clear the crawl logs is to use PowerShell to set the cleanup interval, instead of calling the stored procedure via SQL:-
# Use this to get the ID of the Search Service Application
Get-SPServiceApplication | Where-Object {$_.TypeName -eq "Search Service Application"}
Then:-
# Use the ID to get the search app
$searchApp = Get-SPServiceApplication | Where-Object {$_.Id -eq "a21c3f70-9487-471e-a7ad-b80259c90ff7"}
# Output the current cleanup interval
$searchApp.CrawlLogCleanUpIntervalInDays
# Set the interval to 30 (was 90)
$searchApp.CrawlLogCleanUpIntervalInDays = 30
$searchApp.Update()
You can now run the 'Crawl Log Cleanup for Search Application Search Service Application' task from SharePoint Central Admin, which will pick up the new interval. If the 'MSSCrawlUrlChanges' table holds a large number of records, you may need to start with a larger number than 30 days and repeat in manageable chunks (e.g. 300, 250, 200, etc.).
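For anyone who would rather script the stepped cleanup than click through Central Admin for each pass, here is a rough sketch of the idea. It is untested outside my description above and makes several assumptions: there is only one Search Service Application in the farm, the timer job's display name starts with "Crawl Log Cleanup", the job's LastRunTime changes once a pass has run, and the list of intervals is purely illustrative.

```powershell
# Sketch only: run from a SharePoint 2010 Management Shell on a farm server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: a single Search Service Application exists in the farm.
$searchApp = Get-SPServiceApplication |
    Where-Object { $_.TypeName -eq "Search Service Application" }

# Illustrative step sizes; choose values that keep each pass manageable.
$intervals = 300, 250, 200, 150, 100, 60, 30

foreach ($days in $intervals) {
    # Lower the retention window, as in the manual steps above.
    $searchApp.CrawlLogCleanUpIntervalInDays = $days
    $searchApp.Update()

    # Assumption: the cleanup job's display name starts with "Crawl Log Cleanup".
    $job = Get-SPTimerJob |
        Where-Object { $_.DisplayName -like "Crawl Log Cleanup*" } |
        Select-Object -First 1
    $lastRun = $job.LastRunTime
    Start-SPTimerJob $job

    # Poll once a minute until the job reports a newer LastRunTime.
    do {
        Start-Sleep -Seconds 60
        $job = Get-SPTimerJob |
            Where-Object { $_.DisplayName -like "Crawl Log Cleanup*" } |
            Select-Object -First 1
    } while ($job.LastRunTime -le $lastRun)

    Write-Host "Cleanup pass for $days days ran at $($job.LastRunTime)"
}
```

Keep an eye on the .ldf size between passes; if the log grows too quickly, use smaller steps between intervals.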
I'm now hoping that the smaller cleanup interval will let me schedule the services again, but I will monitor for a while to make sure the problem doesn't recur.
I have never seen anything like this before, but what I would suggest is to stop the search service entirely in SharePoint. Once it has shut down, force the cancellation of the stored procedure in SQL if it is still running.
Restart the service and start a full crawl.
If the problem occurs again, repeat above, but this time before starting a full crawl, delete and recreate your content sources and then start another full crawl.
If it is still occurring, my next suggestion is drastic, but you may have to consider an index reset. Here's the thing though, in case you didn't know: if you reset your index you will lose ALL of your analytics. Everything that Search has learned from your users' searching habits will be lost. Any reporting you do from search (top documents, top search words, etc.) will be lost and will have to be rebuilt. The impact depends on how long you have had search, how much your users have used it, and whether you have reports built off it, so the decision is yours.
The final step I would take would be to delete your entire search service and recreate it (including new databases).
If you have Microsoft Premier support give them a call too.
My suggestions may seem drastic, but Search may make your system completely unusable soon if this continues. If this was happening in my farm and I couldn't find a cause or someone else didn't have a solution I haven't thought of, then I would be doing these steps myself.
Hope it helps.