ruby-on-rails apache nginx reverse-proxy paradigms

Difference in web server paradigms (Apache vs. Reverse proxy + Web server)

I have started developing with Ruby on Rails, and I have encountered what has been described as a different paradigm when it comes to web servers.

Old paradigm (Apache)
=====================

                +--- web process fork
                |
[requests] -----+--- web process fork
                |
                +--- web process fork


New paradigm (Puma + Nginx)
===========================
                                           +---> web app process 1 --> threads
                                           |
[requests] <-->  [reverse proxy server]  --+---> web app process 2 --> threads
                                           |
                                           +---> web app process 3 --> threads

The article I was reading didn't explain the differences between these two paradigms or the benefits of one over the other. That is what I am interested in.

What is the point of this new paradigm used for Ruby on Rails apps? What advantages does it have over the old HTTP daemon approach? What are its disadvantages?

Solution

The application server architecture has the following traits:

Across the board, I’m in favor of running Web applications as app servers and reverse-proxying to them. It takes minimal effort to set this up, and the benefits are plenty: you can manage your web server and app separately, you can run as many or few app processes on as many machines as you want without needing more web servers, you can run the app as a different user with zero effort, you can switch web servers, you can take down the app without touching the web server, you can do seamless deployment by just switching where a fifo points, etc. Welding your application to your web server is absurd and there’s no good reason to do it any more.
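To make the "run as many or few app processes as you want" and "take down the app without touching the web server" points concrete, here is a rough sketch of the bookkeeping a reverse proxy does for its backends. `UpstreamPool` and the addresses are made up for illustration. This is not how nginx is implemented, just plain round-robin over a list.

```ruby
# Hypothetical model of a reverse proxy's backend list (illustration only).
class UpstreamPool
  def initialize(backends)
    @backends = backends.dup
    @cursor = 0
  end

  # Bring a new app process into rotation; the proxy itself keeps running.
  def add(backend)
    @backends << backend
  end

  # Take an app process out of rotation, e.g. during a deploy.
  def remove(backend)
    @backends.delete(backend)
  end

  # Round-robin selection; returns nil when no app process is available.
  def next_backend
    return nil if @backends.empty?

    backend = @backends[@cursor % @backends.size]
    @cursor += 1
    backend
  end
end

pool = UpstreamPool.new(["127.0.0.1:3001", "127.0.0.1:3002"])
first_three = Array.new(3) { pool.next_backend }
```

The web server only ever talks to whatever is in the pool, so app processes can come and go (or live on other machines) without the front end being restarted.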

Compared to the classic model:

- PHP is naturally tied to Apache. Running it separately, or with any other webserver, requires just as much mucking around (possibly more) as deploying any other language.
- php.ini applies to every PHP application run anywhere. There is only one php.ini file, and it applies globally; if you’re on a shared server and need to change it, or if you run two applications that need different settings, you’re out of luck; you have to apply the union of all necessary settings and pare them down from inside the apps themselves using ini_set or in Apache’s configuration file or in .htaccess. If you can. Also wow that is a lot of places you need to check to figure out how a setting is getting its value.
- Similarly, there is no easy way to “insulate” a PHP application and its dependencies from the rest of a system. Running two applications that require different versions of a library, or even PHP itself? Start by building a second copy of Apache.
- The “bunch of files” approach, besides making routing a huge pain in the ass, also means you have to carefully whitelist or blacklist what stuff is actually available, because your URL hierarchy is also your entire code tree. Configuration files and other “partials” need C-like guards to prevent them from being loaded directly. Version control noise (e.g., .svn) needs protecting. With mod_php, everything on your filesystem is a potential entry point; with an app server, there’s only one entry point, and only the URL controls whether it’s invoked.
- You can’t seamlessly upgrade a bunch of files that run CGI-style, unless you want crashes and undefined behavior as users hit your site halfway through the upgrade.
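The "only one entry point" idea can be sketched in a few lines of plain Ruby. The snippet follows the Rack convention (an object that responds to `call(env)` and returns `[status, headers, body]`) but does not load Rack itself. The route table and paths are hypothetical.

```ruby
TEXT = { "content-type" => "text/plain" }.freeze

# Routes are declared explicitly; nothing outside this table is reachable,
# no matter which files happen to exist on disk.
ROUTES = {
  ["GET", "/"]       => ->(_env) { [200, TEXT, ["home\n"]] },
  ["GET", "/health"] => ->(_env) { [200, TEXT, ["ok\n"]] }
}.freeze

# The single entry point: every request goes through this one callable,
# and only the method and path decide which handler runs.
APP = lambda do |env|
  handler = ROUTES[[env["REQUEST_METHOD"], env["PATH_INFO"]]]
  handler ? handler.call(env) : [404, TEXT, ["not found\n"]]
end
```

A request for something like `/config/database.yml` simply gets a 404 here, because the URL is looked up in the route table rather than mapped onto the code tree.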

Other paradigms include:

The web application is a web server, and can accept HTTP requests directly. Examples of this model:
- Almost all Node.js and Meteor JS web applications (https://lookback.io).
- The Trac bug tracking software, running in its standalone server (https://trac.webkit.org).
The web application does not speak HTTP directly, but is connected directly to the web server through some communication adapter. CGI, FastCGI and SCGI are good examples of this (web.py, Flask, Sinatra).
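As a minimal sketch of the adapter model, this is roughly what a CGI-style handler looks like: the web server puts request data into environment variables (`REQUEST_METHOD` and `PATH_INFO` are standard CGI variables) and reads headers, a blank line, and the body from the script's standard output. The function name and response text are made up for illustration.

```ruby
# Build a CGI response from the request environment. The application never
# parses HTTP itself; the web server has already done that.
def cgi_response(env)
  method = env.fetch("REQUEST_METHOD", "GET")
  path   = env.fetch("PATH_INFO", "/")
  body   = "#{method} #{path} handled by a CGI script\n"

  "Content-Type: text/plain\r\n" \
  "Content-Length: #{body.bytesize}\r\n" \
  "\r\n" \
  "#{body}"
end

# Under a real web server the script would be started once per request:
#   print cgi_response(ENV)
```

With plain CGI a new process is started for every request, which is the main cost that FastCGI and SCGI avoid by keeping the application process alive between requests.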

[Diagram: component lifecycle state machine. States: NEW, INITIALIZING, STARTING_PREP, STARTING, STARTED, STOPPING, STOPPED, FAILED, DESTROYING, DESTROYED. Transitions: init(), start(), stop(), destroy().]

On Heroku, apps are completely self-contained and do not rely on runtime injection of a webserver into the execution environment to create a web-facing service. Each web process simply binds to a port, and listens for requests coming in on that port. The port to bind to is assigned by Heroku as the PORT environment variable.
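A bare-bones sketch of that "bind to `PORT` and listen" model, using only Ruby's standard `socket` library. The helper names, the fallback port 3000, and the response text are assumptions for illustration. A real Rails app would leave this to Puma rather than hand-rolling a server.

```ruby
require "socket"

# Use the port assigned by the platform; 3000 is an arbitrary local fallback.
def listen_port(env = ENV, fallback = 3000)
  Integer(env.fetch("PORT", fallback))
end

# Build a minimal plain-text HTTP response.
def http_response(body)
  "HTTP/1.1 200 OK\r\n" \
  "Content-Type: text/plain\r\n" \
  "Content-Length: #{body.bytesize}\r\n" \
  "Connection: close\r\n" \
  "\r\n" \
  "#{body}"
end

# Accept connections one at a time and answer each with a fixed response.
def serve(port)
  server = TCPServer.new("0.0.0.0", port)
  loop do
    client = server.accept
    # Skip the request head: read lines until the blank line that ends it.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    client.write(http_response("hello from port #{port}\n"))
    client.close
  end
end

# Usage (blocks until interrupted):
#   serve(listen_port)
```

The process is self-contained: nothing injects a web server into it, and whatever sits in front (a router or a reverse proxy) only needs to know the port.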
