Archives for the month of: November, 2008

Mongrel has an annoying ‘feature’: it bails out of startup if a PID file already exists where it wants to write one. Without checking whether that PID is actually stale, it assumes a server is already running and exits. Mongrel cluster has a --clean flag that removes all existing PID files, but using Mongrel cluster makes it difficult to have SMF keep each individual Mongrel instance up.

To get around this, we need to remove the PID file as part of the start script in our SMF manifest, e.g.:

rm -f /path/to/pid/file

Unfortunately, chaining commands together with ‘;’ will cause SMF to think the script is exiting prematurely – rm will return 0 and exit, making SMF think the service has died. Instead we need to ‘hack’ it to work. Mongrel ignores anything on STDIN, so we can pipe the rm command to Mongrel, and all will work nicely:

rm -f /path/to/pid/file | /opt/local/bin/mongrel start ...
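
If you would rather not delete a PID file that might belong to a live server, a small helper could check it first. This is only a sketch, not something Mongrel or SMF provides: the function name remove_stale_pid is made up for illustration, and it assumes the script runs as the same user as the server (or as root), because kill -0 also fails when we lack permission to signal the process.

```shell
#!/bin/sh
# Hypothetical helper: remove a PID file only if the process it names is gone.
# Assumption: we run as the server's user (or root), so a failing kill -0
# means "no such process" rather than "permission denied".
remove_stale_pid() {
  pidfile="$1"

  # Nothing to do when no PID file exists.
  [ -f "$pidfile" ] || return 0

  pid=$(cat "$pidfile")

  # kill -0 sends no signal; it only tests whether the process can be signalled.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is still running; leaving $pidfile alone"
    return 1
  fi

  # The recorded process is gone (or the file is empty), so the file is stale.
  rm -f "$pidfile"
}
```

Calling remove_stale_pid /path/to/pid/file returns 0 when the file was absent or stale and has been cleaned up, and 1 when the recorded process is still alive.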

Capistrano is a great tool for running batches of shell commands on remote servers. It’s mostly used in the Rails community, but it is just as helpful for deploying non-Rails apps, or even for running commands unrelated to web apps at all.

All our apps are run on Solaris servers and managed by SMF, and we use a series of manifests to control each individual thin instance (named thin/port4000…thin/port4014). Getting Capistrano to play with SMF (so we can restart the app servers with the new code) is luckily pretty simple. We could create our own spin and reaper scripts, but it’s nicer to keep the commands in the deploy script. We can tell Cap to run our commands like this:

namespace :deploy do
  task :start, :roles => :app do
    run "/usr/sbin/svcadm enable network/thin/*"
  end
end

The stop task is similar, but calls svcadm with disable instead of enable. Most of the time we will be deploying a new version of the app to production, and it’s the restart task that gets called. Restart can be composed of the stop and start tasks:

namespace :deploy do
  task :restart, :roles => :app do
    stop
    start
  end
end

 

By default, SMF only allows the root user to control SMF-managed processes. For security we disable root logins over ssh, so we need a way for lower-privileged users to control our process instances. Role Based Access Control (RBAC) is a clean way of allowing a user to perform specific tasks. To begin, we need to add a new authorisation. Add this line to the end of /etc/security/auth_attr:

solaris.smf.manage.thin:::Manage Thin::

This simply tells Solaris there is a new authorisation; we could have named it anything. To be able to control SMF using this authorisation, we need to apply it to the instances we want to control. There are two types of authorisation we need to apply to each instance: the action authorisation and the value authorisation. Action is used to perform an action on an instance (e.g. restart); value is used to change a value on an instance (e.g. set the enabled/disabled state). We need to do this once for each thin instance we are controlling:

svccfg -s network/thin/port4000 setprop general/value_authorization=astring: 'solaris.smf.manage.thin'
svccfg -s network/thin/port4000 setprop general/action_authorization=astring: 'solaris.smf.manage.thin'
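
With fifteen instances, typing these out by hand gets tedious. A minimal sketch, assuming the instances are numbered consecutively (port4000 to port4014 in our case): the helper below, whose name thin_auth_commands is made up for illustration, only prints the svccfg commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the two svccfg commands for every thin instance
# between the first and last port (inclusive). Nothing is executed here.
thin_auth_commands() {
  first="$1"
  last="$2"
  port="$first"
  while [ "$port" -le "$last" ]; do
    for prop in value_authorization action_authorization; do
      echo "svccfg -s network/thin/port$port setprop general/$prop=astring: 'solaris.smf.manage.thin'"
    done
    port=$((port + 1))
  done
}

# Example: review the output first, then pipe it to a shell as root:
#   thin_auth_commands 4000 4014
#   thin_auth_commands 4000 4014 | sh
```

Printing rather than executing keeps the loop safe to try on any machine, including one without SMF.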

Finally we need to apply this authorisation to the user we want to give control to:

usermod -A solaris.smf.manage.thin admin

The admin user can now start, stop, restart and refresh our thin instances:

svcadm restart network/thin/port4000
svcadm enable network/thin/port4000
svcadm disable network/thin/port4000
svcadm refresh network/thin/port4000 

Solaris can sometimes seem a little foreign to people used to Linux or OS X, but its network performance and admin tools make it an excellent platform for hosting web apps. We decided on nginx as our static file server and load balancer, and thin as our Rails app server. I won’t go into the details of getting these all up and running on Solaris; there are plenty of resources already available for that.

 

nginx
Depending on the way you compile nginx, it may already default to using Solaris event ports. Event ports are Solaris’ scalable I/O event notification mechanism, similar to epoll on Linux and kqueue on BSD/OS X. To make sure nginx uses them, add a use directive to the events block:

events {
  worker_connections  2048;
  use eventport;
}

One of the great admin tools in Solaris is SMF (Service Management Facility). It’s similar to launchd on OS X: built in to the operating system, it keeps your app up and running. Unlike monit, there’s no lag time between checks on the process; the instant a process dies, SMF restarts it. You (unfortunately) use XML files called manifests to tell SMF what to watch, and how to start/stop/refresh it. You can also define dependencies so that, for instance, your app doesn’t start up before network services are ready. This is the manifest we use for nginx:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='nginx'>
  <service name='network/nginx' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>

    <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>

    <dependency name='net' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/network/loopback'/>
    </dependency>

    <exec_method name='start' type='method' exec='/usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf' timeout_seconds='60'>
      <method_context working_directory='/usr/local/nginx/'>
        <method_credential user='root' group='root'/>
      </method_context>
    </exec_method>

    <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'>
      <method_context/>
    </exec_method>

    <exec_method name='refresh' type='method' exec='/bin/kill -HUP `cat /usr/local/nginx/logs/nginx.pid`' timeout_seconds='60'>
      <method_context/>
    </exec_method>
  </service>
</service_bundle>

What we’re saying is:

  • There is a single instance of this process
  • Wait until the file system is started
  • Wait until the network is started
  • Define the start method (command line that’s run to start the service). We also set the CWD and current user.
  • Define the stop method (using the special :kill helper)
  • Define how to refresh the nginx config (send nginx a HUP signal)

Save this and import it into the SMF repository with:

svccfg import nginx.xml

This will check the validity of the XML file, import it, and start the service.

 

thin
No config files are required for thin; instead we supply the config as flags to the process. The SMF manifest we use for thin places the multiple thin processes in their own “thin” hierarchy under network/thin.

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='thin/port4000'>
  <service name='network/thin/port4000' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>

    <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>
    <dependency name='net' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/network/loopback'/>
    </dependency>
    <dependent name='thin_port4000' restart_on='none' grouping='optional_all'>
      <service_fmri value='svc:/milestone/multi-user'/>
    </dependent>

    <exec_method name='start' type='method' exec='/usr/local/bin/thin -e production -d -p 4000 -c /app/current -l /app/current/log/production.log -P /app/current/tmp/pids/thin4000.pid start' timeout_seconds='60'>
      <method_context working_directory='/var/log'>
        <method_credential user='root' group='root'/>
        <method_environment>
          <envvar name='PATH' value='/usr/bin:/bin:/usr/local/bin'/>
        </method_environment>
      </method_context>
    </exec_method>

    <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'>
      <method_context/>
    </exec_method>
  </service>
</service_bundle>

One of these manifests is used per thin instance we want to keep up. So, for instance, if we had a second thin server running on port 4001, we would make a copy of this manifest and change the appropriate port numbers throughout the file (both for the name of the service, and in the start method). This way SMF will only restart a single thin process at a time if any fail.
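
Copying and editing the manifest by hand for each port is error-prone, so the copies can be generated instead. The following is a rough sketch, assuming the port 4000 manifest above is saved as a template and that the string 4000 appears in it only as the port number (true for the manifest shown here). The function name generate_thin_manifests and the output file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write one manifest per port by substituting the port
# number in a template manifest written for port 4000.
# Assumption: "4000" occurs in the template only where the port is meant.
generate_thin_manifests() {
  template="$1"
  first="$2"
  last="$3"
  outdir="$4"
  port="$first"
  while [ "$port" -le "$last" ]; do
    # Replaces the service name, the dependent name, the -p flag and the
    # PID file name in one pass.
    sed "s/4000/$port/g" "$template" > "$outdir/thin-port$port.xml"
    port=$((port + 1))
  done
}

# Example (paths are placeholders):
#   generate_thin_manifests thin-port4000.xml 4001 4014 /tmp/manifests
```

Each generated file can then be imported with svccfg import, just like the nginx manifest.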

By placing the processes under their own hierarchy, we can reference all of them with one pattern. For example, to restart every thin server we can issue one command, regardless of how many thin processes we have set up:

svcadm restart `svcs -H -o FMRI network/thin/*`