Nginx http call partial transfer in Azure Container App Environment


I have two Docker containers: an nginx reverse proxy (repro-nginx) and a .NET Web API (repro-large-response) that produces a "large response" (73.7 kB).

repro-nginx is deployed at https://repro-nginx.lemonsea-adf71d13.westeurope.azurecontainerapps.io/ and repro-large-response is deployed at https://repro-large-response.lemonsea-adf71d13.westeurope.azurecontainerapps.io/. Both are deployed in the same Container App Environment, so the containers can reach each other by service name. Ideally repro-large-response would not be exposed to the outside world; it is exposed here only for demo purposes.

repro-nginx is configured as follows: 3 routes on https://repro-nginx.lemonsea-adf71d13.westeurope.azurecontainerapps.io/:

  • /repro-internal/ => accessing repro-large-response using the internal service name.
  • /repro-external-http/ => accessing repro-large-response using the ingress via http.
  • /repro-external-https/ => accessing repro-large-response using the ingress via https.

The problem: requests to /repro-internal/ (https://repro-nginx.lemonsea-adf71d13.westeurope.azurecontainerapps.io/repro-internal/) sometimes return a partial response. The total size is 65.5 kB instead of 73.7 kB, the JSON data is cut off partway through, and some browsers report the error NS_ERROR_NET_PARTIAL_TRANSFER. The two external endpoints don't have this problem.
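The intermittent truncation can also be checked with a small script instead of a browser. The following is a minimal sketch, assuming Python 3 and only the standard library; the helper names fetch_size and count_truncated and the default attempt count are illustrative and not part of the original setup. It requests the endpoint several times and counts the responses that are shorter than the largest one observed.

```python
import http.client
import urllib.request

URL = "https://repro-nginx.lemonsea-adf71d13.westeurope.azurecontainerapps.io/repro-internal/"


def fetch_size(url: str) -> int:
    """Return the number of body bytes actually received for one GET."""
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            return len(response.read())
    except http.client.IncompleteRead as exc:
        # Raised when the connection closes before the declared length
        # (or the final chunk) arrives; exc.partial holds the bytes received.
        return len(exc.partial)


def count_truncated(fetch, url: str, attempts: int = 20):
    """Call fetch(url) `attempts` times.

    Returns (sizes, truncated), where truncated is the number of responses
    shorter than the largest one seen. Assumption: at least one response
    in the run is complete, so the largest size is the full size.
    """
    sizes = [fetch(url) for _ in range(attempts)]
    full = max(sizes)
    truncated = sum(1 for size in sizes if size < full)
    return sizes, truncated
```

Calling count_truncated(fetch_size, URL) against the deployed endpoint returns the list of observed sizes and the number of short responses. Running the same check against /repro-external-http/ and /repro-external-https/ gives a baseline for comparison.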

We created a demo:

The images are hosted on Docker Hub; you can inspect them if needed.

repro-nginx configuration:

cors-options.conf

proxy_hide_header Access-Control-Allow-Origin;
add_header 'Access-Control-Allow-Origin' $allow_origin;
add_header Vary Origin;
add_header 'Access-Control-Allow-Credentials' 'true';
add_header 'Access-Control-Allow-Methods' 'GET, POST, PUT, DELETE, OPTIONS';
add_header 'Access-Control-Allow-Headers' 'Accept,Authorization,Cache-Control,Content-Type,DNT,If-Modified-Since,Keep-Alive,Origin,User-Agent,X-Requested-With';

cors-map.conf

map $http_origin $allow_origin {
  default "";
}

/etc/nginx/conf.d/default.conf

include snippets/cors-map.conf;

server { 
  listen 80;
#   server_name frontend;

#   error_log /home/logs/error.log debug;

  location / {
    root /usr/share/nginx/html;
    try_files $uri /index.html;
  }

  location /health {
    access_log off;
    error_log   off;
    add_header 'Content-Type' 'application/json';
    return 200 '{"status":"UP"}';
  }

  proxy_read_timeout 300;
  proxy_connect_timeout 300;
  proxy_send_timeout 300; 

  client_max_body_size 500M;
  client_body_buffer_size 500M;

  client_body_timeout 300;
  client_header_timeout 300;

  keepalive_timeout 300;

  proxy_buffer_size                            128k;
  proxy_buffers                                4 256k;
  proxy_busy_buffers_size                      256k;


  location /repro-internal/ {
    include snippets/cors-options.conf;

    proxy_ssl_server_name on;

    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Authorization "Bearer null";
    proxy_pass http://repro-large-response/;
    proxy_http_version 1.1;
  }

  location /repro-external-http/ {
    include snippets/cors-options.conf;

    proxy_ssl_server_name on;

    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Authorization "Bearer null";
    proxy_pass http://repro-large-response.lemonsea-adf71d13.westeurope.azurecontainerapps.io/;
    proxy_redirect off;
    proxy_http_version 1.1;
  }

  location /repro-external-https/ {
    include snippets/cors-options.conf;

    proxy_ssl_server_name on;

    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Authorization "Bearer null";
    proxy_pass https://repro-large-response.lemonsea-adf71d13.westeurope.azurecontainerapps.io/;
    proxy_redirect off;
    proxy_http_version 1.1;
  }
}

The images are deployed using Terraform. Here are the relevant portions of the deployment plan:

// Container app - repro-large-response
resource "azurerm_container_app" "repro-large-response" {
  name                         = "repro-large-response"
  container_app_environment_id = azurerm_container_app_environment.main.id
  resource_group_name          = azurerm_resource_group.main.name
  revision_mode                = "Single"

  identity {
    type = "SystemAssigned"
  }

  ## Issue when deploying immediately with ingress enabled
  ingress {
    external_enabled = true
    target_port      = 8080
    allow_insecure_connections = true
    traffic_weight {
      percentage = 100
      latest_revision = true
    }
  }

  template {
    min_replicas = 1
    max_replicas = 1

    container {
      name   = "repro-large-response"
      image  = "techorama2021cegeka/repro-large-response:3"
      cpu    = 0.25
      memory = "0.5Gi"
    }
  }
  
  tags = {
    "container" = "repro-large-response"
  }
}

// Container app - repro-nginx
resource "azurerm_container_app" "repro-nginx" {
  name                         = "repro-nginx"
  container_app_environment_id = azurerm_container_app_environment.main.id
  resource_group_name          = azurerm_resource_group.main.name
  revision_mode                = "Single"

  identity {
    type = "SystemAssigned"
  }

  ## Issue when deploying immediately with ingress enabled
  ingress {
    external_enabled = true
    target_port      = 80
    allow_insecure_connections = false
    traffic_weight {
      percentage = 100
      latest_revision = true
    }
  }

  template {
    min_replicas = 1
    max_replicas = 1

    container {
      name   = "repro-nginx"
      image  = "techorama2021cegeka/repro-nginx:4"
      cpu    = 0.25
      memory = "0.5Gi"
    }
  }
  
  tags = {
    "container" = "repro-nginx"
  }
}

If you need more information, please let me know.


Solution

  • After talking with Azure support, we found a potential solution: add proxy_set_header Connection ""; to the /repro-internal/ location block.

      location /repro-internal/ {
        include snippets/cors-options.conf;
     
        proxy_ssl_server_name on;
     
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Authorization "Bearer null";
        proxy_set_header Connection "";
        proxy_pass http://repro-large-response/;
        proxy_http_version 1.1;
      }
    

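    This is related to the Envoy ingress component that Azure Container Apps uses. According to Azure support: "it has been discovered that issue is related to Ingress component “envoy”, by adding “Connection:Close” header to the request. we suggest that you could configure the nginx to remove the header “Connection:Close” when forwarding the request". By default, nginx sets the Connection header to "close" on proxied requests; setting it to an empty string with proxy_set_header Connection ""; prevents that header from being passed to the upstream.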
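    One way to confirm locally which Connection header nginx forwards is to point proxy_pass at a tiny server that echoes the header back. This is a rough sketch, assuming Python 3's standard library; the class and function names are hypothetical, and the server is a stand-in for the upstream, not the actual repro-large-response app.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoConnectionHeader(BaseHTTPRequestHandler):
    """Reply with the value of the Connection header the client sent."""

    def do_GET(self):
        # "<absent>" is an arbitrary marker for a missing header.
        value = self.headers.get("Connection", "<absent>")
        body = value.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def start_echo_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the echo server on a background thread.

    Port 0 lets the OS pick a free port; read it from server.server_address.
    """
    server = HTTPServer((host, port), EchoConnectionHeader)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

    With nginx proxying to this server, the response body should be close under the default configuration and <absent> once proxy_set_header Connection ""; is in place.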