I am writing a Flask API and am seeing a lot of failures when load testing.
Looking at the uwsgi logs, I see this rather nasty-looking error:
cx_Oracle.DatabaseError: ORA-12592: TNS:bad packet
The Oracle connection itself works, since not every request fails, but this error appears to be what terminates the HTTP REST call prematurely in most cases.
What is causing this? I am using RHEL with cx_Oracle 7.23, connecting to a 12c database. I am using the Oracle thin client.
Exception on /api/read/maa [GET]
Traceback (most recent call last):
  File "/ariel/anaconda3/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/ariel/anaconda3/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/ariel/anaconda3/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/ariel/anaconda3/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/ariel/anaconda3/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/ariel/anaconda3/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/ariel/anaconda3/lib/python3.7/site-packages/connexion/decorators/decorator.py", line 48, in wrapper
    response = function(request)
  File "/ariel/anaconda3/lib/python3.7/site-packages/connexion/decorators/uri_parsing.py", line 144, in wrapper
    response = function(request)
  File "/ariel/anaconda3/lib/python3.7/site-packages/connexion/decorators/validation.py", line 384, in wrapper
    return function(request)
  File "/ariel/anaconda3/lib/python3.7/site-packages/connexion/decorators/parameter.py", line 121, in wrapper
    return function(**kwargs)
  File "./registrations.py", line 58, in read_maa_non_passive
    for row in cursor_ariel.fetchall():
cx_Oracle.DatabaseError: ORA-12592: TNS:bad packet
Getting data and status code
----UPDATE---------
All my problems went away when I stopped connection pooling in cx_Oracle. I originally had a single connection to Oracle shared across the whole Flask application, which gave me failures in stress testing. So I tried to be clever and use a SessionPool, acquiring a connection and releasing it on each service call. Finally I went back to "bad practice" and now create a completely new connection to Oracle for every single function call (API endpoint). With that I get a 100% success rate across stress testing in Locust, even for the larger responses, which are 30 MB JSON payloads.
For people who don't/haven't read the comment threads, the solution was to start the pool with threaded=True.
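For reference, here is a minimal sketch of what starting the pool with threaded=True and doing acquire/release per call might look like. The helper names (pool_settings, create_pool, fetch_all) and the pool sizes are made up for illustration and are not from the original code; only cx_Oracle.SessionPool, its threaded keyword, acquire() and release() are the library's own. cx_Oracle is imported lazily so the snippet loads on a machine without the Oracle client installed.

```python
def pool_settings(user, password, dsn, max_sessions=4):
    """Build keyword arguments for cx_Oracle.SessionPool (hypothetical helper)."""
    return {
        "user": user,
        "password": password,
        "dsn": dsn,
        "min": 1,             # illustrative sizes, tune for your workload
        "max": max_sessions,
        "increment": 1,
        "threaded": True,     # the fix from the comments: make OCI calls thread-safe
    }


def create_pool(settings):
    """Create the pool once at application start-up, not per request."""
    import cx_Oracle  # lazy import: only needed when a real pool is created
    return cx_Oracle.SessionPool(**settings)


def fetch_all(pool, sql, params=None):
    """Acquire a connection, run one query, and always hand the connection back."""
    conn = pool.acquire()
    try:
        cursor = conn.cursor()
        if params:
            cursor.execute(sql, params)
        else:
            cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Release even if execute/fetchall raises, so the pool is not drained.
        pool.release(conn)
```

Each endpoint would then call something like fetch_all(pool, "select ...") instead of sharing one connection or cursor between threads. Whether this removes ORA-12592 in a given uwsgi setup is an assumption based on the comment thread above, not something I can guarantee.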