I am running a TFS nightly build that for the last few days has not been able to complete all its tests. It fails after several hours with a "Test run is aborted" message. Previous to this the tests always ran successfully, and no major changes(or even minor) have been made to the system that runs these tests.
Information:
When I look in the TFS log for the latest run it lists the following(2017-04-11T06:42:47.5500707Z):
[warning]DistributedTests: Test run is aborted. Logging details of the run logs. [warning]DistributedTests: New test run created.
[warning]Test Run queued for Project Collection Build Service
[warning]DistributedTests: Test discovery started.
[warning]DistributedTests: Test Run Discovery Completed . Test run id: 533
[warning]DistributedTests: 290 test cases discovered.
[warning]DistributedTests: Test execution started. Test run id : 533
[warning]DistributedTests: Test run timed out. Test run id : 533
[warning]DistributedTests: Test run aborted. Test run id: 533
[error]The test run was aborted, failing the task.
When I look at the run log(worker_20170410-234426-utc_864.log) I see:
06:42:47.659516 BaseLogger.LogConsoleMessage(scope.JobId = 7ced7f31-e360-47f3-b334-ef20faeaf000, message = ##[error]The test run was aborted, failing the task.) 06:42:47.659516 Microsoft.TeamFoundation.DistributedTask.Agent.Common.AgentExecutionTerminationException: PowerShell script completed with errors. at Microsoft.TeamFoundation.DistributedTask.Handlers.PowerShellHandler.Execute(ITaskContext context, CancellationToken cancellationToken) at Microsoft.TeamFoundation.DistributedTask.Worker.JobRunner.RunTask(ITaskContext context, TaskWrapper task, CancellationTokenSource tokenSource)
In the test log, I don't see any errors in the VS, just a warning about not able to connect(I see these often):
W, 2060, 5, 2017/04/10, 16:26:03.595, XXXTESTING\QTController.exe, Test of LoadTestResultConnectString failed: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)
I also see an error thrown in the Application Event log at the same time:
The description for Event ID 0 from source Application cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
Error Handler Exception: System.ServiceModel.CommunicationException: There was an error reading from the pipe: The pipe has been ended. (109, 0x6d). ---> System.IO.IOException: The read operation failed, see inner exception. ---> System.ServiceModel.CommunicationException: There was an error reading from the pipe: The pipe has been ended. (109, 0x6d). ---> System.IO.PipeException: There was an error reading from the pipe: The pipe has been ended. (109, 0x6d).....
the message resource is present but the message is not found in the string/message table
The issue is that I really don't know how to interpret these messages, each log just says "test run was aborted, failing the task", I'm not even certain the powershell issue is what caused it. I'm also not sure that the error thrown in the application log is related, though it was thrown at exactly the same time that the run failed.
It's also difficult to research this issue, when you really don't know what's causing the test agent to fail. There are posts related to VS, and to the TFS Test Agent, but these don't strike me as related issues, and of course there is this somewhat unhelpful post about the Powershell message.
Has anyone seen this sort of issue before? I don't think anything on my build server has changed over the last few days(maybe updates...), what do you think would cause an issue like this to occur?
If you look at the failed build(containing tests) after it is aborted in the "Build" section of TFS, its says it was "Aborted", that's it... If you look at results of the build(containing tests) in the "Test" section of TFS it specified that the test run "Exceeded Timeout".
Apparently MSTest was running up against the default value of this little gem. I think it defaults to 8 hours when not specified, but I'm not too sure about this. Anyways I set the following setting in my "Default.testsettings" file:
<?xml version="1.0" encoding="utf-8"?>
<TestSettings name="TestSettings1">
<Execution>
<Timeouts runTimeout="200000000" />
</Execution>
</TestSettings>
Seems to resolve the issue. Tests runs successfully and no longer time out.