
Cloud Run returns a 0-byte response


We migrated our app (Java, Spring Boot) from Heroku to GCP Cloud Run. After running into some performance issues, I changed our REST endpoints to serve requests in a non-blocking way and ran into a strange problem.

Once the non-blocking change was deployed to Cloud Run, all affected endpoints started returning a 0-byte response (as reported by every browser and by cURL), even though the Cloud Run logs report a proper response size:

responseSize: "1137"

Things I've tried so far without success:

  • to take the load balancer out of the equation, I enabled public access and hit Cloud Run directly
  • to make sure the problem is not related to Tomcat, I replaced it with Jetty
  • I tried both generations of Cloud Run execution environments

It works fine locally and from a Docker container, but returns 0-byte responses on Cloud Run.

What could be the reason? What can I check on my side while waiting for support to look into the issue?

UPDATE: Output of curl -L -i -v:

    *   Trying 34.36.35.1:443...
    * Connected to ***** (34.36.35.1) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *  CAfile: /etc/ssl/cert.pem
    *  CApath: none
    * (304) (OUT), TLS handshake, Client hello (1):
    * (304) (IN), TLS handshake, Server hello (2):
    * (304) (IN), TLS handshake, Unknown (8):
    * (304) (IN), TLS handshake, Certificate (11):
    * (304) (IN), TLS handshake, CERT verify (15):
    * (304) (IN), TLS handshake, Finished (20):
    * (304) (OUT), TLS handshake, Finished (20):
    * SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256
    * ALPN, server accepted to use h2
    * Server certificate:
    *  subject: CN=***********
    *  start date: Jun 15 15:58:36 2023 GMT
    *  expire date: Sep 13 16:52:50 2023 GMT
    *  subjectAltName: host "" matched cert's ""
    *  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1D4
    *  SSL certificate verify ok.
    * Using HTTP2, server supports multiplexing
    * Connection state changed (HTTP/2 confirmed)
    * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
    * Using Stream ID: 1 (easy handle 0x12c00e000)
    > GET /api/user/permissions HTTP/2
    > Host: *************
    > user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/114.0
    * some irrelevant headers, like user agent, etc. *
    < HTTP/2 200
    < vary: Origin,Access-Control-Request-Method,Access-Control-Request-Headers,Accept-Encoding,User-Agent
    < cache-control: no-cache, must-revalidate, no-transform, no-store
    < x-content-type-options: nosniff
    < x-xss-protection: 1; mode=block
    < referrer-policy: origin-when-cross-origin
    < content-type: application/json
    < x-cloud-trace-context: 1dfe301ceac344ebb4b30eed4c89a495;o=1
    < date: Wed, 05 Jul 2023 16:02:11 GMT
    < server: Google Frontend
    < content-length: 0
    < via: 1.1 google
    < strict-transport-security: 31536000
    < alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
    <
    * Connection #0 to host ************ left intact


Solution

  • I managed to solve the problem and am posting an answer for anyone facing a similar issue. It turned out that the problem was unrelated to Cloud Run. The responseSize reported by the Cloud Run logs misled me: in reality the response body was 0 bytes, and the size reported by Cloud Run was apparently the size of the headers.

    The root cause was a filter that forcibly flushed the response right after processing had been delegated to another thread, before the future completed. The filter was configured to run only in production, which is why I could not reproduce the problem locally.
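
    For anyone who wants to see why the order of flush and completion matters, here is a minimal, self-contained sketch. It does not use the Servlet or Spring APIs: FakeResponse, FlushTimingDemo, and the BODY payload are hypothetical names made up for illustration, and the rule that writes after a commit are dropped is a simplifying assumption about how a committed response behaves. The future is completed manually rather than from a worker thread so that the result is deterministic.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

/** Hypothetical, simplified stand-in for an HTTP response (not the Servlet API). */
class FakeResponse {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean committed = false;
    private int sentBodyLength = -1;

    /** Buffers body bytes. Assumption of this model: writes after commit are dropped. */
    void write(String body) {
        if (committed) {
            return;
        }
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        buffer.write(bytes, 0, bytes.length);
    }

    /** Commits the response with whatever has been buffered so far. */
    void flush() {
        if (!committed) {
            committed = true;
            sentBodyLength = buffer.size();
        }
    }

    /** Body length that was "sent" at commit time, or -1 if never committed. */
    int sentBodyLength() {
        return sentBodyLength;
    }
}

class FlushTimingDemo {
    // Hypothetical payload, only for illustration.
    static final String BODY = "{\"permissions\":[]}";

    /** Problematic ordering: flush right after delegating, before the future completes. */
    static int eagerFlush() {
        FakeResponse response = new FakeResponse();
        CompletableFuture<String> pending = new CompletableFuture<>();
        CompletableFuture<Void> done = pending.thenAccept(response::write);
        response.flush();        // commits an empty body
        pending.complete(BODY);  // the body arrives too late and is dropped
        done.join();
        return response.sentBodyLength();
    }

    /** Safer ordering: flush only as a continuation of the completed future. */
    static int deferredFlush() {
        FakeResponse response = new FakeResponse();
        CompletableFuture<String> pending = new CompletableFuture<>();
        CompletableFuture<Void> done =
                pending.thenAccept(response::write).thenRun(response::flush);
        pending.complete(BODY);
        done.join();
        return response.sentBodyLength();
    }

    public static void main(String[] args) {
        System.out.println("eager flush body length:    " + eagerFlush());
        System.out.println("deferred flush body length: " + deferredFlush());
    }
}
```

    In this model the eager variant commits a body of length 0, while the deferred variant commits the full payload. A real filter would have to tie its flush to the completion of the asynchronous request in whatever way the framework provides; the sketch only illustrates the ordering problem.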