We migrated our app (Java, SpringBoot) from Heroku to GCP Cloud Run, after facing some performance-related issues I modified our REST endpoint to use a non-blocking way of serving requests and encountered a weird problem.
When the non-blocking change was deployed to the CloudRun all related endpoints return 0 bytes response (reported by any browser or cURL), even though CloudRun logs report proper response size:
responseSize: "1137"
Things I've tried so far without success:
It works fine locally or from a docker container but returns 0 bytes responses on the CloudRun.
What could be the reason? What can I check on my side while waiting for the support to deal with the issue?
UPDATE:
Result of calling cURL -L -i -v
:
Trying 34.36.35.1:443...
Connected to ***** (34.36.35.1) port 443 (#0)
ALPN, offering h2
ALPN, offering http/1.1
successfully set certificate verify locations:
CAfile: /etc/ssl/cert.pem
CApath: none
(304) (OUT), TLS handshake, Client hello (1):
(304) (IN), TLS handshake, Server hello (2):
(304) (IN), TLS handshake, Unknown (8):
(304) (IN), TLS handshake, Certificate (11):
(304) (IN), TLS handshake, CERT verify (15):
(304) (IN), TLS handshake, Finished (20):
(304) (OUT), TLS handshake, Finished (20):
SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
ALPN, server accepted to use h2
Server certificate:
subject: CN=***********
start date: Jun 15 15:58:36 2023 GMT
expire date: Sep 13 16:52:50 2023 GMT
subjectAltName: host "" matched cert's ""
issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1D4
SSL certificate verify ok.
Using HTTP2, server supports multiplexing
Connection state changed (HTTP/2 confirmed)
Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
Using Stream ID: 1 (easy handle 0x12c00e000)
GET /api/user/permissions HTTP/2
Host: *************
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/114.0
some irrelevant headers, like user agent, etc. *
< HTTP/2 200 HTTP/2 200 < vary: Origin,Access-Control-Request-Method,Access-Control-Request-Headers,Accept-Encoding,
User-Agent vary:
Origin,Access-Control-Request-Method,Access-Control-Request-Headers,Accept-Encoding,
User-Agent < cache-control: no-cache, must-revalidate, no-transform,
no-store cache-control: no-cache, must-revalidate, no-transform,
no-store < x-content-type-options: nosniff x-content-type-options:
nosniff < x-xss-protection: 1; mode=block x-xss-protection: 1; mode=block < referrer-policy: origin-when-cross-origin
referrer-policy: origin-when-cross-origin
content-type: application/json content-type: application/json
x-cloud-trace-context: 1dfe301ceac344ebb4b30eed4c89a495;o=1
x-cloud-trace-context: 1dfe301ceac344ebb4b30eed4c89a495;o=1
< date: Wed, 05 Jul 2023 16:02:11 GMT date: Wed, 05 Jul 2023 16:02:11 GMT
server: Google Frontend server: Google Frontend
content-length: 0
content-length: 0 < via: 1.1 google via: 1.1 google <
strict-transport-security: 31536000 strict-transport-security: 31536000 < alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000 alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
Connection #0 to host ************ left intact
I managed to solve the problem and post an answer for anyone facing a similar issue. It turned out that the problem was unrelated to the CloudRun, the responseSize reported by CloudRun logs misled me, in reality, the responseBody was 0 bytes and the size reported by CR apparently was the size of headers.
To root cause of the problem was a filter forcibly flushing the response right after delegating the processing to another thread, before the future was completed. It was configured to run only in production - that's why I was not able to reproduce it locally.