I've created a function that:
I'm getting connection errors about 5% of the time, and the stack traces reference Python site-packages rather than my own code. How can I continue debugging this issue?
I added retries around every step of reading from Cloud Storage, but this failure seems to occur before my code even begins running. Alternatively, are my logs not making it to Stackdriver?
Here is the full stack trace. I don't see where any of it references lines in my code.
Function execution started
AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x7ea453f9e780>" raised exception!
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1229, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1275, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1224, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1016, in _send_output
self.send(msg)
File "/opt/python3.7/lib/python3.7/http/client.py", line 977, in send
self.sock.sendall(data)
ConnectionResetError: [Errno 104] Connection reset by peer
None
During handling of the above exception, another exception occurred:
None
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 720, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/env/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 400, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/env/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 734, in reraise
raise value.with_traceback(tb)
File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1229, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1275, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1224, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/opt/python3.7/lib/python3.7/http/client.py", line 1016, in _send_output
self.send(msg)
File "/opt/python3.7/lib/python3.7/http/client.py", line 977, in send
self.sock.sendall(data)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
None
During handling of the above exception, another exception occurred:
None
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 123, in __call__
method, url, data=body, headers=headers, timeout=timeout, **kwargs
File "/env/local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/env/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/env/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
None
The above exception was the direct cause of the following exception:
None
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 96, in refresh
self._retrieve_info(request)
File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 77, in _retrieve_info
request, service_account=self._service_account_email
File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/_metadata.py", line 200, in get_service_account_info
recursive=True,
File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/_metadata.py", line 132, in get
response = request(url=url, method="GET", headers=_METADATA_HEADERS)
File "/env/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 128, in __call__
six.raise_from(new_exc, caught_exc)
File "<string>", line 3, in raise_from
google.auth.exceptions.TransportError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
None
The above exception was the direct cause of the following exception:
None
Traceback (most recent call last):
File "/env/local/lib/python3.7/site-packages/grpc/_plugin_wrapping.py", line 79, in __call__
callback_state, callback))
File "/env/local/lib/python3.7/site-packages/google/auth/transport/grpc.py", line 77, in __call__
callback(self._get_authorization_headers(context), None)
File "/env/local/lib/python3.7/site-packages/google/auth/transport/grpc.py", line 64, in _get_authorization_headers
self._request, context.method_name, context.service_url, headers
File "/env/local/lib/python3.7/site-packages/google/auth/credentials.py", line 124, in before_request
self.refresh(request)
File "/env/local/lib/python3.7/site-packages/google/auth/compute_engine/credentials.py", line 102, in refresh
six.raise_from(new_exc, caught_exc)
File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
I thought the issue was blob.download_as_string() getting a connection error.
However, after deploying a simplified version of the function I cannot recreate the error.
This thread suggests adding ConnectionResetError and ProtocolError to the exceptions that get retried:
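from urllib3.exceptions import ProtocolError
from google.api_core import retry

# Retry only on the low-level connection failures seen in the trace.
predicate = retry.if_exception_type(
    ConnectionResetError, ProtocolError)
reset_retry = retry.Retry(predicate)
# Wrap the download so it is retried whenever the predicate matches.
data = reset_retry(blob.download_as_string)()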
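To make the retry behaviour concrete, here is a minimal standard-library sketch of the same idea: retry a callable only when it raises one of the listed exception types, with exponential backoff between attempts. It is not the google.api_core implementation; retry_on, flaky_download and all of their parameters are hypothetical names made up for illustration.

```python
import time


def retry_on(exc_types, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Return a decorator that retries a callable on the given exception types."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except exc_types:
                    # Out of attempts: let the last error propagate.
                    if attempt == attempts - 1:
                        raise
                    # Exponential backoff: base_delay, 2x, 4x, ...
                    sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator


# Hypothetical stand-in for blob.download_as_string: fails twice, then succeeds.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionResetError(104, "Connection reset by peer")
    return b"payload"


# Record the delays instead of really sleeping, so the sketch runs instantly.
delays = []
wrapped = retry_on((ConnectionResetError,), attempts=5, sleep=delays.append)(flaky_download)
data = wrapped()
```

Note that a wrapper like this only covers exceptions raised inside the wrapped call. The trace above is raised from the AuthMetadataPluginCallback during a credentials refresh, so it may not pass through a retry placed around the download at all.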
I wish I knew why this connection error happens so often.
I've discovered the cause of this intermittent error.
GCP best practices say to instantiate client connections at module level in your main.py, outside of main(). That code only executes on instance cold starts.
For example:
[main.py] - instantiates clients only during cold start
import builtins
from google.cloud import storage
from google.cloud import pubsub_v1
from google.cloud import logging as cloudlogging
# Create global clients to avoid unneeded network activity!
builtins.pubsub_client = pubsub_v1.PublisherClient()
builtins.storage_client = storage.Client()
builtins.log_client = cloudlogging.Client()
[other_func.py] - uses clients
bucket = storage_client.create_bucket(bucket_name)
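The cold-start behaviour described above can be illustrated without any GCP libraries. The following is only a sketch: FakeClient is a hypothetical stand-in for storage.Client(), and it uses a plain module-level name rather than builtins to keep the example short. Module-level code runs once when the instance loads the module, while main() runs on every invocation and reuses the same object.

```python
class FakeClient:
    """Hypothetical stand-in for storage.Client(); counts how often it is built."""
    created = 0

    def __init__(self):
        FakeClient.created += 1


# Module scope: executed once, when the instance first loads this module.
storage_client = FakeClient()


def main(event=None, context=None):
    # Executed on every invocation; reuses the module-level client.
    return storage_client


# Two invocations on the same warm instance share a single client.
first = main()
second = main()
```

Because the client outlives any single invocation, whatever connections it holds are also reused across invocations for as long as the instance stays warm.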