It is a fairly odd behavior coming from an unknown part of the application.
My setup:
The application will work for a seemingly random amount of time (usually 2-3 minutes), then will stop responding and so will the SQL Server. Other applications even with other accounts are unable to do any request to the database.
The number of explicit requests on my side is 1 in the ready() of the django application to get some inital data.
def ready(self):
from django.conf import settings
from app.models import SomeModel
try:
settings.SomeModel_ID = SomeModel.objects.filter(identifier=settings.SomeIdentifier)[0].pk
except:
settings.SomeModel_ID = SomeModel.objects.create(identifier=settings.SomeIdentifier).pk
SQL Server Tracer will log some requests but nothing unusual (quite a lot of BatchStarted/BatchFinished).
Wireshark with see an insane amount of packets moving between the application and the database (We are talking +250 for a simple SELECT). Here I took an example with some TCP but +95% for the packets are TDS.
5422 36.248815392 10.10.10.66 -> 10.10.10.103 TDS 183 TLS exchange
5423 36.249013989 10.10.10.103 -> 10.10.10.66 TDS 135 TLS exchange
5424 36.249427950 10.10.10.66 -> 10.10.10.103 TDS 135 TLS exchange
5425 36.250678349 10.10.10.103 -> 10.10.10.66 TCP 1514 [TCP segment of a reassembled PDU]
5426 36.250703683 10.10.10.103 -> 10.10.10.66 TDS 607 TLS exchange
5427 36.250856893 10.10.10.66 -> 10.10.10.103 TCP 66 1433 → 39348 [ACK]
Seq=2816 Ack=5937 Win=131584 Len=0 TSval=148074754 TSecr=605610420
5428 36.253444263 10.10.10.66 -> 10.10.10.103 TDS 4215 TLS exchange
5429 36.253462203 10.10.10.103 -> 10.10.10.66 TCP 66 39348 → 1433 [ACK]
Seq=5937 Ack=6965 Win=45440 Len=0 TSval=605610421 TSecr=148074754
5430 36.255551301 10.10.10.66 -> 10.10.10.103 TDS 4215 TLS exchange
5431 36.255572551 10.10.10.103 -> 10.10.10.66 TCP 66 39348 → 1433 [ACK]
I know the application is the cause of the DoS because shutting it down restores the access to the DB for everyone instantly.
When listing active connection using:
SELECT DB_NAME(dbid) AS DBName,
COUNT(dbid) AS NumberOfConnections, loginame
FROM sys.sysprocesses
GROUP BY dbid, loginame
ORDER BY DB_NAME(dbid)
Only one connection is listed.
There is no loop of any kind that would offer an easy explanation for this.
We found that the problem occured when using a Pool, regardless of if we used the db connection inside the function, creating multiple child processed was causing the problem. To solve it, simply connection.close()
before forking the process.