Tags: django, openshift, connection-timeout

Openshift app not responding


I have a work-in-progress Django app hosted on OpenShift. At some point over the last few days it stopped responding entirely in the browser.

I can still ssh into it, I can deploy to it, and the openshift.redhat.com interface isn't reporting any errors with the app, but requesting any page through the browser gets no response. The browser just keeps saying "Waiting for appname-user.rhcloud.com".

The logs might provide some insight.

rhc tail appname
==> app-root/logs/python.log <==
Unable to open logs
(98)Address already in use: make_sock: could not bind to address 127.8.172.129:8080
no listening sockets available, shutting down
Unable to open logs
(98)Address already in use: make_sock: could not bind to address 127.8.172.129:8080
no listening sockets available, shutting down
Unable to open logs
... (ad infinitum)
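
To see how often the bind failure repeats, and whether it is always the same address, the log can be summarized with a few standard tools. This is only a rough sketch: `bind_failures` is a name made up for illustration, and the default path is simply the one shown in the `rhc tail` output above, assumed to be relative to the gear's home directory.

```shell
#!/bin/sh
# Summarize Apache bind failures in a log file.
# Prints "<count> <address:port>" for each distinct address that Apache
# could not bind to.
# bind_failures is a hypothetical helper name; the default log path is an
# assumption taken from the rhc tail output.
bind_failures() {
    logfile="${1:-app-root/logs/python.log}"
    grep 'could not bind to address' "$logfile" \
        | sed 's/.*could not bind to address //' \
        | sort \
        | uniq -c \
        | awk '{print $1, $2}'
}
```

A single address with a steadily growing count would suggest that something is already holding that port each time Apache tries to start.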

I don't think it stopped responding after a deploy. It was working, then it wasn't, and I have no idea what happened.

I would appreciate any ideas as to what could cause this, or how to debug or resolve it.


EDIT

Based on fat fantasma's suggestion I tried a force-stop/start cycle, and this is what was logged:

$ rhc app force-stop -a notebook
$ rhc app start -a notebook

$ rhc tail notebook
==> app-root/logs/python.log <==
(98)Address already in use: make_sock: could not bind to address 127.8.172.129:8080
no listening sockets available, shutting down
Unable to open logs
(98)Address already in use: make_sock: could not bind to address 127.8.172.129:8080
no listening sockets available, shutting down
Unable to open logs
[Sun Nov 16 01:26:13 2014] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:openshift_t:s0:c4,c359
[Sun Nov 16 01:26:13 2014] [notice] Digest: generating secret for digest authentication ...
[Sun Nov 16 01:26:13 2014] [notice] Digest: done
[Sun Nov 16 01:26:13 2014] [notice] Apache/2.2.15 (Unix) mod_wsgi/3.4 Python/2.7.5 configured -- resuming normal operations

==> app-root/logs/postgresql.log <==
2014-11-16 06:26:05 GMT LOG:  trying another address for the statistics collector
2014-11-16 06:26:05 GMT LOG:  could not bind socket for statistics collector: Cannot assign requested address
2014-11-16 06:26:05 GMT LOG:  disabling statistics collector for lack of working socket
2014-11-16 06:26:05 GMT WARNING:  autovacuum not started because of misconfiguration
2014-11-16 06:26:05 GMT HINT:  Enable the "track_counts" option.
2014-11-16 06:26:06 GMT LOG:  database system was interrupted; last known up at 2014-11-15 07:33:35 GMT
2014-11-16 06:26:06 GMT LOG:  database system was not properly shut down; automatic recovery in progress
2014-11-16 06:26:06 GMT LOG:  record with zero length at 0/1D48850
2014-11-16 06:26:06 GMT LOG:  redo is not required
2014-11-16 06:26:06 GMT LOG:  database system is ready to accept connections

I could be wrong, but I still have the feeling that the app itself is fine and that the problem could be DNS-related. I ran a DNS trace on my app URL (http://notebook-davur.rhcloud.com), and the trace ends with this nugget:

Sending request to "ns3.p23.dynect.net" (208.78.71.23)
Received authoritative (AA) response:
-> Header: Non-Existent Domain
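
The same check can be approximated from a terminal, assuming `dig` is installed locally. The sketch below only pulls the response status out of dig's header line; `dns_status` is a hypothetical helper name, and NXDOMAIN is dig's spelling of "Non-Existent Domain".

```shell
#!/bin/sh
# Read dig output on stdin and print the status field from its header line,
# e.g. NOERROR or NXDOMAIN.
# dns_status is a hypothetical helper name used for illustration.
dns_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# Example usage (needs network access and dig, so it is left commented out):
#   dig notebook-davur.rhcloud.com | dns_status
```

A NXDOMAIN status would match the trace result above: the name itself is not registered, whatever the app is doing.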

Solution

  • I was fairly sure this wasn't a code issue as the problem didn't start right after deploying any changes. To verify that the problem wasn't with the code, I created a new app and pushed the same code to it.

    $ rhc app create newappname python-2.7
    $ rhc cartridge add postgresql-9.2 -a newappname
    

    Take note of the Git URL returned after the app create command.

    $ git remote add newapptest GIT_URL
    $ git push newapptest --force
    $ open http://newappname-mydomain.rhcloud.com
    

    Et voila! It worked.

    Since this was a mere hobby project with no important data in the database, I went ahead and deleted the original app through the web interface, then re-ran the above commands with newappname replaced by originalappname. Instead of adding a new git remote, I updated the existing remote, which still pointed at the now-deleted original app.

    $ git remote set-url openshift GIT_URL
    

    (Note: All the data in the database is deleted with the app, so if the data is valuable, start with a pg_dump and end with a pg_restore.)

    And my original URL is once again responding. I'm pretty sure it had something to do with OpenShift's own settings, perhaps DNS or the like.
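
For anyone who does need to keep the data, the pg_dump / pg_restore step mentioned in the note might look roughly like the sketch below. It rests on several assumptions that are not from my own run: the commands are executed on the gear over ssh, the PostgreSQL cartridge exports `OPENSHIFT_POSTGRESQL_DB_HOST` and `OPENSHIFT_POSTGRESQL_DB_PORT`, and the credentials are picked up from the environment (for example a .pgpass file). The helper names `run`, `backup_db` and `restore_db` are hypothetical.

```shell
#!/bin/sh
# Sketch: dump the database before deleting the app, restore it afterwards.
# Assumptions: run on the gear, host/port come from the cartridge's
# environment variables, credentials come from the environment.

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# backup_db DBNAME DUMPFILE
# -Fc writes pg_dump's custom format, which pg_restore can read.
backup_db() {
    run pg_dump -Fc \
        -h "$OPENSHIFT_POSTGRESQL_DB_HOST" \
        -p "$OPENSHIFT_POSTGRESQL_DB_PORT" \
        -f "$2" "$1"
}

# restore_db DBNAME DUMPFILE
# --no-owner skips restoring object ownership, which helps if the role
# names differ on the new gear.
restore_db() {
    run pg_restore --no-owner \
        -h "$OPENSHIFT_POSTGRESQL_DB_HOST" \
        -p "$OPENSHIFT_POSTGRESQL_DB_PORT" \
        -d "$1" "$2"
}
```

Setting `DRY_RUN=1` first shows the exact commands without touching the database. The dump file has to be copied off the gear (with scp, say) before the app is deleted, since the gear's files go away with it.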