I have Apache Superset and an Apache2 server on the same EC2 instance. Apache2 acts as a reverse proxy: it accepts HTTPS requests and forwards them to Apache Superset, which is run using gunicorn.
Requests to the Apache Dremio data engine can take some time (< 60 seconds). When dashboards are accessed through the proxy, using the DNS name with SSL, some dashboard parts (requests) fail with the following error:
Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request
Reason: Error reading from remote server
Strangely, these errors can appear within a matter of seconds, even though the default value of ProxyTimeout is quite high.
The problem doesn't occur if Superset is accessed by IP address.
Error message in apache2/error.log:
(20014) Internal error (specific information not available): [client 10.4.26.3:6969] AH01102: error reading status line from remote server localhost:8088, referer: ...
The problem could be either a proxy server timeout or the Superset web server dropping some connections. My Apache2 config:
<VirtualHost *:443>
    ProxyPreserveHost On
    ProxyRequests Off
    ServerName dash.domain.com
    ServerAlias dash.domain.com
    SSLEngine on
    SSLCertificateFile /etc/ssl/private/cert.crt
    SSLCertificateChainFile /etc/ssl/certs/cert2.crt
    SSLCertificateKeyFile /etc/ssl/private/key.key
    ProxyPass / http://localhost:8088/ connectiontimeout=3600 timeout=3600
    ProxyPassReverse / http://localhost:8088/
    # things tried
    # SetEnv force-proxy-request-1.0 1
    # SetEnv proxy-nokeepalive 1
    # SetEnv proxy-initial-not-pooled 1
    # ProxyTimeout 3600
    # TimeOut 3600
</VirtualHost>
Things tested (and not working):
- Timeout and ProxyTimeout
- connectiontimeout and timeout (as seen above)
- Keepalive=On for ProxyPass
- superset_config.py -> ENABLE_PROXY_FIX, SUPERSET_WEBSERVER_TIMEOUT

In addition, a similar proxy setup was built using nginx; the error is similar to what is described here.
Any help or ideas would be appreciated. Thank you very much!
Apache Superset version: 0.37.2
Apache Dremio version: 4.1.0
Apache2 server version: 2.4.29
EC2 instance type: t3.medium
OS version: Ubuntu 18.04
The problem was dying gunicorn async workers. The charts were sending too many requests and the workers were not able to handle them. Changing the worker type from async to sync (the gunicorn default) solved the proxy problem.
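For anyone who wants to try the same change, here is a minimal sketch of a gunicorn config file using sync workers. The file name, bind address, worker count and timeout are my assumptions for illustration, not the values from the original setup. The only part that comes from the fix described above is the sync worker class.

```python
# gunicorn.conf.py (hypothetical file name; all values below are illustrative assumptions)
import multiprocessing

# Listen only on localhost, since Apache2 proxies to http://localhost:8088/.
bind = "127.0.0.1:8088"

# "sync" is gunicorn's default worker class; this is the change that fixed the proxy errors.
worker_class = "sync"

# A common starting point for sync workers is (2 x cores) + 1; tune for your instance.
workers = multiprocessing.cpu_count() * 2 + 1

# gunicorn kills and restarts a worker that stays silent longer than this many seconds.
# Assumed value: it is set above the ~60 second Dremio queries so slow requests survive.
timeout = 120
```

Assuming the application entry point is `superset.app:create_app()` as in the Superset docs, this could be started with something like `gunicorn -c gunicorn.conf.py "superset.app:create_app()"`. With sync workers, each worker handles one request at a time, so the worker count limits how many charts can load concurrently.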
I still don't know why direct access by IP was not producing the 502 proxy error.
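One way to check whether gunicorn itself is dropping connections, independent of Apache2, is to send a burst of concurrent requests straight to the backend and count the outcomes. The following is only a rough sketch: the function names `probe` and `probe_many` are made up for this example, and the URL in the usage comment is a placeholder you would replace with a slow chart request from your own dashboard.

```python
import concurrent.futures
import urllib.error
import urllib.request

# Opener with an empty ProxyHandler so requests go directly to the backend,
# ignoring any http_proxy settings in the environment.
_direct = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, timeout=90):
    """Send one GET and return 'ok', 'http <code>' or the exception class name."""
    try:
        with _direct.open(url, timeout=timeout) as resp:
            resp.read()
            return "ok"
    except urllib.error.HTTPError as exc:
        return f"http {exc.code}"
    except Exception as exc:
        # Dropped or refused connections end up here.
        return type(exc).__name__


def probe_many(url, n=20, timeout=90):
    """Fire n concurrent GETs at url and return a dict of outcome -> count."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(lambda _: probe(url, timeout), range(n)))
    counts = {}
    for outcome in results:
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts


# Usage (placeholder URL, replace with a real slow request):
# print(probe_many("http://localhost:8088/<slow-chart-url>", n=20))
```

If a noticeable share of the outcomes are connection errors rather than "ok", the backend is closing connections before answering, which would match the AH01102 "error reading status line" message on the Apache2 side.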
Sorry for not including information about gunicorn in the question.
P.S. The worker type recommended for Apache Superset in their docs is async but, in my case, sync workers were the better solution. In theory, sync workers are slower compared to async (in the Superset context).