Apache Best Practice Deployment
By Charles Brian Quinn
The preferred setup (for now) is to put Mongrel behind an Apache 2.2.x server running mod_proxy_balancer. Apache is a proven web server, runs half the Internet, and is a pain to configure. These instructions should get you started, but refer to the Apache folks for anything more complex or weird.
When you're just starting out, don't bother with anything but plain Mongrel. Mongrel is slower than Apache, but not so slow that small installations will notice it. The worst thing you can do is try to learn Apache configuration while you're also trying to learn Ruby on Rails and Mongrel. Start small, then *when you need to*, build up to the big stuff.
A simple single mongrel configuration
Start up a single mongrel instance on port 8000:
$ mongrel_rails start -d -p 8000 \
    -e production -P /full/path/to/log/mongrel-1.pid
Now we'll tell Apache to proxy all requests to the mongrel server running on port 8000. Add the following to your httpd.conf or to a vhost.conf file:
# Once you've turned mod_proxy on in your environment, it is very important
# to turn ProxyRequests off to avoid inadvertently being an open forward proxy!
ProxyRequests off
<VirtualHost *:80>
ServerName myapp.com
ServerAlias www.myapp.com
ProxyPass / http://www.myapp.com:8000/
ProxyPassReverse / http://www.myapp.com:8000/
ProxyPreserveHost on
</VirtualHost>
That's it, in a nutshell. Several things to note in this configuration:
1) This configuration forwards all traffic to mongrel. This means mongrel will serve images, javascript, files, and everything else. It's quite fast at this, but Apache can do it better.
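Here are some basic rules you can add to tell ProxyPass not to forward requests for certain paths. The exclusions (the lines ending in !) must appear before the general ProxyPass / rule, because Apache checks ProxyPass rules in configuration order and the first match wins: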
ProxyPass /images !
ProxyPass /stylesheets !
# continue with other static files that should be served by apache
Alias /images /path/to/public/images
Alias /stylesheets /path/to/public/stylesheets
# continue with aliases for static content
For a more complete set of rules that forwards only dynamic content to mongrel, see the detailed configuration below.
2) In this configuration, it is entirely possible that two users (web requests) could hit your application at the exact same time, and one would have to wait (typically milliseconds) until the first request is finished before having a turn at the mongrel instance. Unless you've got some really long-running requests, HTTP clients are pretty good at waiting in line. Only your own metrics can tell you how many requests arrive at the exact same time and how long each one has to wait.
Suffice it to say, if you're ready to start scaling with multiple mongrel instances, read on.
Using multiple mongrel instances with mod_proxy_balancer
First, let's start up a few mongrel instances (Linux/FreeBSD):
$ mongrel_rails start -d -p 8001 \
    -e production -P log/mongrel-1.pid
$ mongrel_rails start -d -p 8002 \
    -e production -P log/mongrel-2.pid
$ mongrel_rails start -d -p 8003 \
    -e production -P log/mongrel-3.pid
$ mongrel_rails start -d -p 8004 \
    -e production -P log/mongrel-4.pid
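The commands above differ only in the port and the pid file number, so they can be generated by a small loop. This is a minimal sketch, not part of Mongrel: the function name mongrel_start_commands and its two arguments (first port, instance count) are made up for illustration, and the function only prints the commands so you can inspect them before running anything.

```shell
#!/bin/sh
# Sketch: print one "mongrel_rails start" command per instance.
# mongrel_start_commands is a hypothetical helper, not a Mongrel tool.
mongrel_start_commands() {
    first_port=$1   # port of the first instance, e.g. 8001
    count=$2        # how many instances to start
    i=1
    while [ "$i" -le "$count" ]; do
        port=$((first_port + i - 1))
        echo "mongrel_rails start -d -p $port -e production -P log/mongrel-$i.pid"
        i=$((i + 1))
    done
}

# Print the same four commands listed above.
mongrel_start_commands 8001 4
```

To actually start the instances, pipe the output to a shell from your application's root directory, for example mongrel_start_commands 8001 4 | sh.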
You can also use mongrel_cluster by Bradley Taylor for managing several mongrel instances with a configuration file (and sysv init scripts for *nix-flavor servers).
We'll be using mod_proxy_balancer, a new feature in Apache 2.1/2.2 and above, to proxy requests to our mongrel instances. This software-based HTTP load balancer will distribute requests evenly (applying a weighting and selection algorithm) across our mongrel instance(s). It even comes with a swell load-balancing manager page for monitoring incoming requests. For more information, see Apache's mod_proxy_balancer documentation.
Obtaining Apache 2(.1+)
I won't go into too many details, as Windows and the various Linux distributions all have several methods for obtaining Apache 2, but you will need the following modules:
- mod_proxy, mod_proxy_http, mod_proxy_html, and mod_proxy_balancer
- mod_rewrite
- mod_deflate
- mod_headers
- (optional) mod_cache and one of mod_mem_cache or mod_disk_cache
- (optional) mod_ssl
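Before going further, it can help to confirm that the required modules are actually loaded. The following is a rough sketch: missing_modules is a hypothetical helper that reads a module listing on standard input and prints the required module identifiers it cannot find. It assumes the listing uses identifiers such as proxy_module and rewrite_module, which is the form apachectl -M (or httpd -M) prints on Apache 2.2; check the output on your own system.

```shell
#!/bin/sh
# Sketch: report which required modules are absent from a module listing.
# missing_modules is a hypothetical helper; it reads the listing from stdin.
missing_modules() {
    loaded=$(cat)
    for mod in proxy_module proxy_balancer_module rewrite_module \
               deflate_module headers_module; do
        case "$loaded" in
            *"$mod"*) ;;          # found, nothing to report
            *) echo "$mod" ;;     # not found, print its name
        esac
    done
}

# Typical use (assumes apachectl is on your PATH):
#   apachectl -M 2>/dev/null | missing_modules
```

An empty result means all five required modules appeared in the listing. The optional modules are left out of the check on purpose.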
If you're compiling from source, this configuration should do the trick:
$ ./configure --enable-deflate --enable-proxy --enable-proxy-html \
    --enable-proxy-balancer --enable-rewrite --enable-cache \
    --enable-mem-cache --enable-ssl --enable-headers
Note: If Apache is going to be fronting only Mongrel instances (Mongrel serving up Ruby on Rails or any other ActiveRecord-containing framework), some have noted better performance and stability using the worker MPM instead of prefork. If you don't know what this means, it's safe to ignore.
Essentially, in the default prefork MPM, Apache spawns several processes when it starts up (pre-forking) and spawns more if more requests come in that need to be handled. On a heavily trafficked, very dynamic (not much cached content/assets) Rails site, if you are doing nothing but servicing Rails, it doesn't make sense to spawn 20 Apache processes in front of 3 Mongrel processes, as Mongrel will be queuing the requests up anyway.
Configuring Apache2
A good practice, recommended by several other good guides, is to separate your Apache configuration into several files. We'll be storing the configuration for our application in several different files. Put these files somewhere that apache2 knows about. Apache is quite good about scanning for all .conf files in certain directories.
myapp.common
Apache lets you include common configuration items into another configuration so you can cut down on repetition. What we're going to do is make a file that has all the common junk that every Mongrel application needs to work at all, then we'll just include this in little .conf files for any application we deploy.
Notice that this file doesn't end in .conf since it's not a real configuration file, but you can name it however you wish.
Important Update: typo fixed in IE deflate rules
Important Update: fixed bug with existence of /system directory on OS X and Solaris
ServerName myapp.com
DocumentRoot /var/www/myapp.com/current/public
<Directory "/var/www/myapp.com/current/public">
Options FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
RewriteEngine On
# Uncomment for rewrite debugging
#RewriteLog logs/myapp_rewrite_log
#RewriteLogLevel 9
# Check for maintenance file and redirect all requests
# ( this is for use with Capistrano's disable_web task )
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ %{DOCUMENT_ROOT}/system/maintenance.html [L]
# Rewrite index to check for static
RewriteRule ^/$ /index.html [QSA]
# Rewrite to check for Rails cached page
RewriteRule ^([^.]+)$ $1.html [QSA]
# Redirect all non-static requests to cluster
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://mongrel_cluster%{REQUEST_URI} [P,QSA,L]
# Deflate
AddOutputFilterByType DEFLATE text/html text/plain text/css
# ... text/xml application/xml application/xhtml+xml text/javascript
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# Uncomment for deflate debugging
#DeflateFilterNote Input input_info
#DeflateFilterNote Output output_info
#DeflateFilterNote Ratio ratio_info
#LogFormat '"%r" %{output_info}n/%{input_info}n (%{ratio_info}n%%)' deflate
#CustomLog logs/myapp_deflate_log deflate
myapp.conf
We then take the above common file and include it in our configuration file for this application deployment.
If you're using virtual hosting (a pretty good idea, even when you're the only one on the server), your sample configuration can be this simple:
<VirtualHost *:80>
Include /etc/httpd/conf.d/myapp.common
ErrorLog logs/myapp_errors_log
CustomLog logs/myapp_log combined
</VirtualHost>
myapp.proxy_cluster.conf
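This is the meat of our configuration, and goes hand in hand with our mongrel (or mongrel_cluster) configuration. This configuration tells the apache2 mod_proxy_balancer to proxy requests to 3 mongrel instances running on ports 8000, 8001, and 8002. The ports listed here must match the ports your mongrel instances actually listen on (the instances started earlier in this guide used 8001 through 8004), so adjust the list accordingly.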
<Proxy balancer://mongrel_cluster>
BalancerMember http://127.0.0.1:8000
BalancerMember http://127.0.0.1:8001
BalancerMember http://127.0.0.1:8002
</Proxy>
If you had a separate application server, you could balance to it easily by replacing the 127.0.0.1 with the IP or hostname of your application server, but be sure to make the mongrel instances listen on an external interface (rather than 127.0.0.1).
When you add an additional mongrel to your mongrel_cluster, you can simply add an additional BalancerMember to this file, restart apache (or reload) and you're all set.
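Since each mongrel adds exactly one BalancerMember line, the whole block can be generated from a starting port and a count. This is only a sketch: balancer_block is a hypothetical helper, and it assumes all instances run on 127.0.0.1 on consecutive ports.

```shell
#!/bin/sh
# Sketch: print a <Proxy balancer://...> block for consecutive local ports.
# balancer_block is a hypothetical helper, not an Apache or Mongrel tool.
balancer_block() {
    name=$1         # balancer name, e.g. mongrel_cluster
    first_port=$2   # port of the first mongrel instance
    count=$3        # number of mongrel instances
    echo "<Proxy balancer://$name>"
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "BalancerMember http://127.0.0.1:$((first_port + i))"
        i=$((i + 1))
    done
    echo "</Proxy>"
}

# Example: four members on ports 8001 through 8004.
balancer_block mongrel_cluster 8001 4
```

Redirect the output into your proxy cluster file and reload Apache after reviewing it. Generating the file from the same port and count you use to start the mongrels keeps the two lists from drifting apart.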
(optional) myapp.proxy_frontend.conf
This optional file will set up the balancer-manager, a simple front end for viewing how your requests are being handled. As configured below, the balancer-manager is reachable only from localhost, so no one else (possibly not even you) can view it unless you alter the "Deny" and "Allow" lines.
Listen 8080
<VirtualHost *:8080>
<Location />
SetHandler balancer-manager
Deny from all
Allow from localhost
</Location>
</VirtualHost>
SSL Requirements
In order for mongrel to know that a request was forwarded over https, we'll need to add a special request header (hence the need for mod_headers, included in most apache2 builds).
Include /etc/httpd/conf.d/myapp.common
# This is required to convince Rails (via mod_proxy_balancer) that we're
# actually using HTTPS.
RequestHeader set X_FORWARDED_PROTO 'https'
You need this mostly so that redirects go back to https and so you can spot when people are coming through SSL or not.
Automation, Automation, Automation
There are several great tools that automate the setup of Apache for use with mongrel and mongrel_cluster. The RailsMachine gem can automate an entire setup of a Rails application. Also, Slingshot Hosting has a sample set of Capistrano recipes that automatically set up Apache2 and mongrel through the rake remote:setup task. Be sure to check out both for some ideas.
Running Multiple Rails Apps with Mongrel
The newest version of Mongrel supports multiple Rails applications through the use of the --prefix option. Assuming your prefix is app1, the Apache magic for proxying a single application is:
ProxyPass /app1 http://127.0.0.1:3000/app1
ProxyPassReverse /app1 http://127.0.0.1:3000/app1
You need to have the proxy pass the new directory name.
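With more than one prefixed application, the same pair of directives repeats per application. As a sketch, the helper below prints the pair for a given prefix and port. prefix_proxy_rules is a hypothetical name, and the second application (app2 on port 3001) is an invented example, not something from the mongrel list.

```shell
#!/bin/sh
# Sketch: print the ProxyPass/ProxyPassReverse pair for one prefixed app.
# prefix_proxy_rules is a hypothetical helper; app2/3001 below is made up.
prefix_proxy_rules() {
    prefix=$1   # value given to mongrel's --prefix, without the leading slash
    port=$2     # port that application's mongrel listens on
    echo "ProxyPass /$prefix http://127.0.0.1:$port/$prefix"
    echo "ProxyPassReverse /$prefix http://127.0.0.1:$port/$prefix"
}

prefix_proxy_rules app1 3000
prefix_proxy_rules app2 3001
```

Note that the prefix appears on both sides of each directive, which is the "pass the new directory name" requirement mentioned above.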
Thanks to Joey Geiger and others of the mongrel list for these instructions.
Success Stories
Martins on the mongrel-list has submitted this simple apache configuration. It serves up static content with apache, and forwards dynamic content on to mongrel using ProxyPass. Thanks Martins:
<VirtualHost *>
ServerName myapp.tld
ServerAlias www.myapp.tld
DocumentRoot /var/www/sites/myapp/current/public
<Directory "/var/www/sites/myapp/current/public">
Options FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
RewriteEngine On
# Check for maintenance file. Let apache load it if it exists
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteRule . /system/maintenance.html [L]
# Let apache serve static files
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f
RewriteRule (.*) $1 [L]
# Don't do forward proxying
ProxyRequests Off
# Enable reverse proxying
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
# Pass other requests to mongrel instance
ProxyPass / http://127.0.0.1:8200/
ProxyPassReverse / http://127.0.0.1:8200/
</VirtualHost>
Phillip Hallstrom has submitted this apache configuration, which includes support for having static directories handled by Apache, PHP support, and hiding .svn directories.
<VirtualHost *:80>
ServerName myserver.com
DocumentRoot /path/to/my/app/public
<Directory "/path/to/my/app/public">
Options FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
<Proxy balancer://mongrel_cluster>
BalancerMember http://127.0.0.1:8805
</Proxy>
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ $1/ [R]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} \.php
RewriteRule ^(.*)$ $1 [QSA,L]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME}/index.html -f
RewriteRule ^(.*)$ $1/index.html [QSA,L]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME}/index.php -f
RewriteRule ^(.*)$ $1/index.php [QSA,L]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} -d
RewriteRule ^(.*[^/])$ $1/ [QSA,L]
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://mongrel_cluster%{REQUEST_URI} [P,QSA,L]
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE application/x-javascript
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
php_value include_path /path/to/my/app/php:/usr/local/lib/php:.
php_value auto_prepend_file /path/to/my/app/php/auto_prepend.php
# this not only blocks access to .svn directories, but makes it appear
# as though they aren't even there, not just that they are forbidden
<DirectoryMatch "^/.*/\.svn/">
ErrorDocument 403 /404.html
Order allow,deny
Deny from all
Satisfy All
</DirectoryMatch>
</VirtualHost>
Jens Kraemer reports this differing proxy setup that uses the P option in Rewrite rules so as not to use the ProxyPass directive:
# Don't do forward proxying
ProxyRequests Off
# Enable reverse proxying
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
RewriteEngine On
# Check for maintenance file. Let apache load it if it exists
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteRule . /system/maintenance.html [L]
# Rewrite index to check for static
RewriteRule ^/$ /index.html [QSA]
# Let apache serve static files (send everything that is *not* a
# static file (!-f) via mod_proxy)
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
RewriteRule .* http://127.0.0.1:8200%{REQUEST_URI} [L,P,QSA]
The P option on the last rule replaces the ProxyPass and ProxyPassReverse directives.
SVN Security
If you use svn to issue checkouts instead of exports, you'll need to hide those pesky .svn directories. This works:
# this not only blocks access to .svn directories, but makes it appear
# as though they aren't even there, not just that they are forbidden
<DirectoryMatch "^/.*/\.svn/">
ErrorDocument 403 /404.html
Order allow,deny
Deny from all
Satisfy All
</DirectoryMatch>
Sending Environment variables to mongrel through proxy
Jon Reads reports successfully reading the REMOTE_USER variable:
After many hours trying to solve the same problem I found this post: Forcing a proxied host to generate REMOTE_USER
and can confirm that the following works for me when put in the Proxy directive on Apache 2:
RewriteEngine On
RewriteCond %{LA-U:REMOTE_USER} (.+)
RewriteRule . - [E=RU:%1]
RequestHeader add X-Forwarded-User %{RU}e
Update
Satya reports that this works better:
RewriteEngine On
RewriteCond %{IS_SUBREQ} ^false$
RewriteCond %{LA-U:REMOTE_USER} (.+)
RewriteRule . - [E=RU:%1]
RequestHeader add Remote-User %{RU}e
His explanation:
Note the first RewriteCond. The LA-U in the 2nd RewriteCond causes an
internal subrequest, which causes inf recursion inside apache. Apache
eventually catches it, but it does bog down the server (and crashed our
shib, but that's not your problem). I think the 1st RewriteCond fixes
it.
Peer Allen reports that you can send any environment variable through to mongrel:
Here is the Apache config I used to forward the GEOIP_COUNTRY_CODE from the
Maxmind mod_geoip module. It is basically the same as the REMOTE_USER
forwarding, but since the GEOIP variable is an environment variable in
Apache you have to access it differently in the RewriteCond with the "ENV"
prefix. See the mod_rewrite documentation for this:
2. %{ENV:variable}, where variable can be any environment variable, is
also available. This is looked-up via internal Apache structures and (if not
found there) via getenv() from the Apache server process.
http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html#RewriteRule
# Forward the GEOIP_COUNTRY_CODE
RewriteCond %{ENV:GEOIP_COUNTRY_CODE} (.+)
RewriteRule . - [E=RU:%1]
RequestHeader add X-Forwarded-GeoIP %{RU}e
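The forwarding examples above all store the captured value in the same intermediate variable, RU. If you forward several variables from one configuration, a later rule would overwrite RU before the headers are built, so a distinct intermediate name per variable is safer. The sketch below prints the three directives for any variable and header name. forward_env_rules and the FWD_ naming scheme are assumptions made for this example, not anything reported on the list.

```shell
#!/bin/sh
# Sketch: print the rewrite/header directives that forward one Apache
# environment variable to mongrel as a request header.
# forward_env_rules and the FWD_ prefix are hypothetical choices.
forward_env_rules() {
    var=$1      # Apache environment variable to read, e.g. GEOIP_COUNTRY_CODE
    header=$2   # request header name to send to mongrel
    echo "RewriteCond %{ENV:$var} (.+)"
    echo "RewriteRule . - [E=FWD_$var:%1]"
    echo "RequestHeader add $header %{FWD_$var}e"
}

forward_env_rules GEOIP_COUNTRY_CODE X-Forwarded-GeoIP
```

The output has the same shape as Peer Allen's configuration, with FWD_GEOIP_COUNTRY_CODE in place of RU.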
Caveats
Jason Hoffman reports:
Apache's mod_proxy_balancer module is a fully blocking module and with
the default httpd.conf you're going to max out in the 120-160 requests/
second range on a decent box. You can tune up its proxying to about a
1000 req/sec.
So yes the net result is that you can really only put a couple of
mongrels behind apache's proxy engine (about 2 "hello world" rails
mongrels).
John Dewey wrote:
I noticed under the Apache 2.2 configuration there were maintenance rewrite rules:
# Check for maintenance file and redirect all requests
# ( this is for use with Capistrano's disable_web task )
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ /system/maintenance.html [L]
Apache returns a 403 when the maintenance file exists. After some tracing the problem exists when a /system/ directory exists on the machine. So OS X and Solaris run into this problem.
I have a corrected rewrite rule that should fix the problem:
RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
RewriteCond %{SCRIPT_FILENAME} !maintenance.html
RewriteRule ^.*$ %{DOCUMENT_ROOT}/system/maintenance.html [L]
I posted further details here: http://discuss.joyent.com/viewtopic.php?id=14452
The documentation has been updated to reflect this.
References and Other Guides
[1] Time For A Grown-Up Server: Rails, Mongrel, Apache, Capistrano and You
[2] Bradley Taylor's Fluxura and RailsMachine
[3] Slingshot Hosting Automated Capistrano Recipe
Thanks to many users on the mongrel list for making it easy for me to compile all these tips and tricks as they come across the list.
