Saturday, June 28, 2008

The Oldest API

If you ask anyone about the Unix API, you'll get a lot of different answers. Some people will talk about the original Unix syscalls. Some will talk about which commands you expect to find on the system. Others might talk of POSIX. Still others might talk about which libc calls are most portable. However, there's a basic interface so fundamental to making Unix what it is that most people forget that it even exists.

I'm talking about the "process" API. When you run a Unix command, it generally has three magic "files" open. Represented by file descriptors 0, 1, and 2, the "files" are called stdin, stdout, and stderr respectively, and they're really data streams that may refer to other processes or to devices like the terminal or a serial port. Not surprisingly, the first one is read for input, the second one is written for output, and the third one is written for "error information". Finally, when the process exits, you get a "return code". An amazing amount can be done with just these simple tools: spawning a process (fork() and exec() on Unix), read(), write(), and exit().  Add in setenv() if you care to pass information in the environment, too.
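To make that concrete, here is a minimal sketch in Python of a parent driving a child through nothing but this API. The names run_child and CHILD and the GREETING variable are made up for the illustration; the child is just a second Python interpreter running a few lines of code.

```python
import os
import subprocess
import sys


def run_child(code, stdin_text, extra_env=None):
    """Spawn a child process and talk to it using only the process API."""
    env = dict(os.environ)          # start from our own environment
    env.update(extra_env or {})     # optionally pass extra variables down
    result = subprocess.run(
        [sys.executable, "-c", code],
        input=stdin_text,           # becomes the child's stdin (fd 0)
        capture_output=True,        # collect stdout (fd 1) and stderr (fd 2)
        text=True,
        env=env,
    )
    return result.returncode, result.stdout, result.stderr


# Hypothetical child: upper-cases stdin, reports an environment variable
# on stderr, and exits with return code 3.
CHILD = (
    "import os, sys\n"
    "data = sys.stdin.read()\n"
    "sys.stdout.write(data.upper())\n"
    "sys.stderr.write('greeting=' + os.environ.get('GREETING', ''))\n"
    "sys.exit(3)\n"
)

if __name__ == "__main__":
    code, out, err = run_child(CHILD, "hello\n", {"GREETING": "hi"})
    print(code, repr(out), repr(err))
```

Everything the parent learns about the child arrives through those three streams and the return code, which is the whole point.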

Why am I posting this nonsense on an Engine Yard-themed blog? In dealing with some internal engineering issues, I was struck by the elegance of components that use this API to great effect. In recognizing them, I realized that very few people probably know how simple it is to extend them using this relatively ancient API.

First, take a look at Nagios.  It's a slightly obtuse but fairly commonly used monitoring system.  For most small installations, you can quickly generate custom monitoring of your infrastructure.  Part of why it is so powerful is that it comes with a suite of fairly flexible plug-ins that do the heavy lifting of the monitoring.  What is slightly less well known is that these plug-ins use the process API.  With the simple application of a Ruby script (or shell, Python, Java, C, etc.), you can write a plug-in to monitor whatever you want.

How do you use this API, you ask?  Simply do whatever you need to do for the check, then print out a line and exit with the appropriate return code.  The return codes are:
  • exit with return code zero (OK)
  • exit with return code one (WARN)
  • exit with return code two (CRIT)
  • exit with return code three (UNKNOWN)
The line of text has a format that encodes enough data that most graphing utilities can create some impressive graphs.  You simply output something like "OK - nuclear reactor is fine | temp=500 F;800;1000;0;1500, pressure=6000 kPa;10000;10800;0;12000".  This little bit of text gives two measurements, their names, their units of measure, the warning and critical thresholds for each, and the minimum and maximum of each.
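A plug-in along those lines might look like the sketch below. It is only an illustration: the function names, the "at or above the threshold" comparison, and the lambda standing in for a real sensor are all assumptions, and the text after the "|" simply mirrors the example above, so check the Nagios plug-in guidelines for the exact syntax your installation expects.

```python
import sys

# Return codes from the list above.
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARN: "WARN", CRIT: "CRIT", UNKNOWN: "UNKNOWN"}


def check(value, warn, crit):
    """Map a measurement to a return code (assumes higher is worse)."""
    if value >= crit:
        return CRIT
    if value >= warn:
        return WARN
    return OK


def status_line(name, value, unit, warn, crit, low, high):
    """Build the return code and the one line of output."""
    code = check(value, warn, crit)
    # Measurement data after the "|", laid out like the example in the post.
    perf = f"{name}={value} {unit};{warn};{crit};{low};{high}"
    return code, f"{LABELS[code]} - {name} is {value} {unit} | {perf}"


def main(read_value):
    """Run one check; read_value is a hypothetical sensor callback."""
    try:
        value = read_value()
    except Exception as exc:
        # If we cannot even take the measurement, say so with UNKNOWN.
        print(f"UNKNOWN - could not read temp: {exc}")
        return UNKNOWN
    code, line = status_line("temp", value, "F", 800, 1000, 0, 1500)
    print(line)
    return code


if __name__ == "__main__":
    # Stand-in for a real sensor read.
    sys.exit(main(lambda: 500))
```

Run by hand, the script prints its one line and the shell's $? holds the return code, which is all the monitoring system needs to see.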

Another use of this simple API is found in Red Hat Cluster Suite (RHCS).  RHCS keeps track of which nodes are running.  These nodes lock appropriate resources to do their work.  In CLVM, they lock the clustered volume metadata.  In GFS, they lock blocks of the filesystem.  In csnap and cmirror, they lock blocks of a block device.  In all cases, these locks are critical to keep data from being trashed on your SAN.

When one of these nodes fails, the system must free the locks that the dead node held.  When those locks are freed, the old node no longer has permission to work with the resource it locked.  If that node were to wake up and keep going about its business (since it thinks it has the lock), then it might trash whatever data is represented by that resource.

To prevent this, the cluster "fences" the node.  The idea is that it puts the node into a sort of virtual "penalty box" (i.e. behind a fence) that prevents this from happening.  As you might imagine, this is critical to the safety of data in a cluster.  Just as important, every clustered infrastructure will have to do this differently.  Thus it is critical that it be as easy as possible to plug in your own fencing agents (at least critical for the adoption of RHCS).

To write a fencing agent, all you have to do is write a program that reads from stdin, writes to stdout, and returns a sane exit code.  Sound familiar?  Again, the most fundamental API in Unix rears its venerable head.  The details are all in their wiki.  Using this interface, RHCS comes with agents that will allow manual fencing (for testing), fencing various SANs at the SAN itself, fencing machines by resetting them at programmable power switches, fencing virtual machines by talking to their control infrastructure, or anything else you can implement.
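As a rough idea of the shape of such an agent, here is a sketch in Python. It assumes the agent is handed name=value lines on stdin and reports success with exit code zero; the option name "node", the helper names, and the pretend power switch are all hypothetical, so take the real option names and exit-code conventions from the RHCS documentation rather than from this example.

```python
import sys


def parse_options(lines):
    """Collect name=value pairs from an iterable of text lines."""
    options = {}
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and anything that is not name=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


def fence(options, power_off):
    """Fence the named node; return 0 on success, 1 on any failure."""
    node = options.get("node")  # hypothetical option name
    if not node:
        print("no node given", file=sys.stderr)
        return 1
    try:
        ok = power_off(node)
    except Exception as exc:
        print(f"fencing {node} failed: {exc}", file=sys.stderr)
        return 1
    return 0 if ok else 1


def pretend_power_off(node):
    """Stand-in for talking to a real power switch or SAN."""
    print(f"pretending to power off {node}")
    return True


if __name__ == "__main__":
    sys.exit(fence(parse_options(sys.stdin), pretend_power_off))
```

Failing closed matters here: if the agent cannot tell whether the node was really fenced, a non-zero exit code is the safe answer, since the cluster is about to hand that node's locks to someone else.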

This was exactly what was necessary here at Engine Yard, so this simple API came in handy.  It's as simple as it is powerful, and often, it's all that you need.

1 comment:

tsykoduk said...

Honestly, simple solutions should be adopted over less simple ones, all other things being equal. I think that a lot of technologists lose sight of this simple rule, and are overcome with almost a lust for more complex systems when there is really no need. I think that the classic example is the registry in Windows. A few simple text files would be a much simpler solution, and you would get at least 90% of the functionality out of them. However, someone at MS was enamored with the idea, and presto. Most complex configuration system ever.

Anyways, here's to simple solutions to complex problems, and reusing existing tools which get the job done!