
“ShellShock” … a truly stunning example of an ill-considered feature.

For those who live under a rock, or weren’t paying attention, the so-called ShellShock bug, as stated by most, is that if you create an environment variable in the form name='() { :; } ; command' and start Bash, command will be executed unconditionally when Bash starts. That isn’t normally a problem, but if Bash is the default shell, and (say) a web script executes a system() call to run a system command, it’s going to run Bash. And since CGI scripts (and things that behave like CGI) put things they got from the original web client’s HTTP headers into environment variables, that basically provides a means of running whatever you want in the context of the web application. Ugly.

Of course there are now patches, now that the white hats know about the problem, although how long the black hats have known and were exploiting it, no-one can say.

So let’s look at the problem in detail. (If you aren’t familiar with Unix-like OSes and shell programming you can stop reading now).

Bash has a feature that allows a function to be exported in the environment and imported from the environment. For example,

$ foo() { echo i am foo ; }        # Define a function foo
$ foo                              # Execute it
i am foo
$ bash                             # Start a subshell
$ foo                              # foo is not defined in the subshell
bash: foo: command not found
$ exit                             # Return to the outer level
$ export -f foo                    # export foo to the environment
$ bash                             # Start another subshell
$ foo                              # foo is now available to the subshell
i am foo

Now the mechanism that Bash uses to implement this feature is simple. Too simple. Internally, Bash maintains separate tables of variables and functions. On starting, it imports the environment into the list of variables. This is true of all Bourne-compatible shells like Bash. But Bash has a couple of special cases, one of which is that it can place functions into the environment too. (It doesn’t by default; you have to use export -f function-name to do this.)

The environment is pretty straightforward; it is simply a list of strings in the form name=value. So how does Bash store and retrieve a function?

It’s simple. Too simple. It looks for the string “() {” (that is, open paren, close paren, space, open curly). In our example, foo() is exported as “foo=() { echo i am foo ; }”. When Bash starts, it recognises the “() {”, rewrites the line as “foo() { echo i am foo ; }”, and hands it straight to its command interpreter for execution, just as if it had been entered like the first line of the example.

Prior to the patches coming out, that’s all it did. It didn’t check to see if the definition had anything after the closing curly bracket. So if you put anything in the environment that looked like “function-name='() { function-definition } ; other-commands'”, other-commands would be unconditionally run. The patches attempt to stop other-commands from being executed.
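The widely circulated one-line probe for this behaviour can be wrapped up as below. This is a minimal sketch: the function name shellshock_check and the variable name x are mine, and it only exercises the original "trailing command" form described above, not the later variants.

```shell
# Prints "vulnerable" if Bash runs the text after the function body
# at startup, otherwise "patched". Only probes the original bug form.
shellshock_check() {
  out=$(env x='() { :; }; echo vulnerable' bash -c 'echo probe' 2>/dev/null)
  case "$out" in
    *vulnerable*) echo vulnerable ;;
    *)            echo patched ;;
  esac
}

shellshock_check
```

A "patched" result here says nothing about the function import feature itself, which is still present.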

As I write this, most patches out there are flawed, because there are other things that can go badly awry with this. And that’s not a surprise, because the basic action is still, fundamentally, hand a piece of arbitrary text, of unknown source, to the command interpreter. How could that possibly go wrong?

Let’s step back a bit here. The environment is a place for programs to put bits of data for programs running in sub-processes to pick up. Usually, this is benign; the sub-processes generally only look for variables they want, and can take or leave the data. There are of course many examples of shell scripts executing environment variables as shell code, because they haven’t quoted the expansions properly, but generally, you can write secure shell script.

But Bash’s function export/import feature fundamentally changes that model. It allows the code that the script is executing to be changed by data inherited from outside its control, and before the script takes control.

For example, let’s just assume that all the patches to Bash work, and the functionality is reduced to only ever allowing a function to be imported, and never having any other nasty side effect. I can still do this:

$ cd() { echo I am a bad man ; }  # Redefine the cd shell builtin
$ export -f cd                    # Export it
$ cat x.sh                        # x.sh is just a script that does a cd
#!/bin/bash
cd /home
$ ./x.sh                          # And run the script
I am a bad man
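If you control how the script is launched, one partial defence is to start it with a scrubbed environment. The sketch below assumes that throwing away every inherited variable is acceptable, which it often isn't, and the PATH value is just an illustrative choice.

```shell
# Run inside an explicit bash so that export -f is available.
bash -c '
  cd() { echo I am a bad man ; }    # hostile override, as in the example above
  export -f cd

  # Child inherits the exported function, so cd is hijacked.
  bash -c "cd / ; pwd"

  # Child started by env -i inherits nothing, so the real builtin runs.
  env -i PATH=/usr/bin:/bin bash -c "cd / ; pwd"
'
```

The first child prints the hostile message and never changes directory; the second actually moves to / and reports it.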

The implications? If I can control the environment, I can control the operation of the commands executed by any Bash script run from my session, including, for example, any script launched by a privileged program. And if /bin/sh is linked to Bash, or Bash is the default shell, any shell command launched via a system() call is also a “bash script”, since system(“command”) simply spawns a sub-process, and in it, executes /bin/sh -c command.
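A quick way to see what system() will actually run on a given box is to ask /bin/sh what it is. This is a rough sketch; it relies on Bash setting BASH_VERSION even when invoked as sh, and other shells leaving it unset.

```shell
# Ask /bin/sh to identify itself; Bash sets BASH_VERSION even when run as sh.
/bin/sh -c 'echo "/bin/sh reports: ${BASH_VERSION:-not Bash}"'

# Show what /bin/sh points at, if it is a symlink.
ls -l /bin/sh
```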

When I look at the function import feature of Bash, my reaction is, why the hell did anyone think this was a good idea?

I’m usually not keen on removing features. In my experience, if you think nobody would do it that way, you’re probably wrong (see don’s law). But this one is just bad for so many reasons it’s ridiculous. It’s not needed for /bin/sh compatibility. As far as I can make out, it’s rarely if ever used at all. So if there’s a candidate for a featurectomy, this is it. (If you want to do this, the offending code is in the function initialize_shell_variables() in the file variables.c of the Bash source code at ftp.gnu.org/gnu/bash/.)

Or perhaps we should all just do what FreeBSD and Debian Linux have already done, and use a smaller, lighter shell (such as Dash) for shell scripts (installed/linked as /bin/sh), and relegate Bash to interactive command interpreter duties only.

Band-aid patching around this bug without removing the underlying issue – that Bash imports code from an untrusted source – is only addressing part of the problem.


Edit: There are of course now patches in play which do a few things; the band-aids referred to above, and a new one to move the exported functions into environment variables named BASH_FUNC_functionname. I’m not sure that the latter significantly improves security of the “feature”.
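To see how your own Bash encodes an exported function, you can export one and look at the environment. A small sketch; the exact variable name and layout vary between Bash versions and patch levels, so treat whatever it prints as specific to your install.

```shell
# Export a function and show the environment entries that mention it.
bash -c '
  foo() { echo i am foo ; }
  export -f foo
  env | grep -A 2 "foo"
'
```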

However, there is one way to deal with commands being passed to /bin/sh. Bash recognises when it is executed as “sh”, and makes some assumptions. This patch (to Bash 4.3 patch 27) makes Bash refuse to import functions when executed as “sh”. The advantage of this is that commands invoked from system(), and scripts that specify their interpreter as “#!/bin/sh” (and therefore should not expect Bash-isms to be present) will not be vulnerable to any abuse of the function export/import feature.
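You can check whether your sh picks up exported functions with something like the following. It is only a sketch: it tells you what happens on your system, not whether that particular patch is applied.

```shell
# Export a function from bash, then try to call it from sh.
bash -c '
  foo() { echo i am foo ; }
  export -f foo
  sh -c foo 2>/dev/null || echo "sh did not import foo"
'
```

If sh is Dash, or a Bash that refuses imports when run as sh, you get the second message; a stock Bash linked as /bin/sh will typically print “i am foo”.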

Don’t get me wrong, I am still advocating a complete featurectomy. But this might be more acceptable to those who think importing random functions from who knows where is somehow a good idea…

Seriously.  They don’t like it.  They sulk.

Brendan Gregg of the Sun Microsystems Fishworks engineering team has written up this effect, with video, at http://blogs.sun.com/brendan/entry/unusual_disk_latency

Moreover, don’t vibrate your drives.  Why am I saying this?

Because, three months ago we took delivery of three 1U pizza boxes. They’re small Supermicro boxes, with room for a normal ATX motherboard and a hard drive.  We equipped these with terabyte drives, fairly normal Supermicro motherboards, 3 GHz Core2 Duo CPUs and 8GB memory each.

They just didn’t run right.  Occasionally, one wouldn’t even make it through an OS install, and the ones that did wouldn’t put through as much work as a much lower spec machine.

We suspected the drives; we suspected the power supply.  Actually, we really thought it was the power supply, but even though the PSUs on these chassis were small, and the 12V rails seemed to be running slightly low, at 11.85V, no amount of bashing the numbers suggested that the systems were actually underpowered.

The first breakthrough was running “hdparm -t --direct /dev/sda” on the drive, which showed wildly fluctuating numbers, consistent with the behaviour we were seeing.  So it was something to do with the disk subsystem.
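To put a number on “wildly fluctuating”, the results of several runs can be boiled down to a minimum and maximum. This is a sketch of my own, not what we ran at the time; the helper name throughput_spread is made up, and it assumes each hdparm result line ends in “= <number> MB/sec”.

```shell
# Summarise the MB/sec figures from repeated hdparm runs read on stdin.
# Assumes each result line ends in "= <number> MB/sec".
throughput_spread() {
  awk '$NF == "MB/sec" {
         v = $(NF - 1) + 0
         if (n == 0 || v < min) min = v
         if (n == 0 || v > max) max = v
         n++
       }
       END {
         if (n > 0) printf "runs=%d min=%.2f max=%.2f MB/sec\n", n, min, max
       }'
}

# Example use (needs root and a real device; /dev/sda as in the text):
#   for i in 1 2 3 4 5; do hdparm -t --direct /dev/sda; done | throughput_spread
```

A healthy drive should show a narrow gap between min and max; a big gap is the kind of instability described here.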

The next breakthrough was when we discovered that if we unplugged the chassis fan (an ugly centrifugal thing) from the motherboard, the problem went away.  The hdparm numbers stabilised at 100MB/s or more.

We saw small changes in power supply volts when we did this, so we were still suspecting the power supply.  I put an ammeter on the fan power line, to see how much current the fan was pulling.  1.2A at full speed.

We played with the fan speed in the BIOS; at its lowest speed, it would pull 0.25A, and the drive would perform well; at the “server” setting, with the server otherwise unloaded, it would pull about 0.6A.  At that rate, it was starting to have an effect on performance.

This was a PSU that was supposed to be able to deliver 18A on the 12V rail, and 260W total.  I really couldn’t see how the 12V would be at the edge when the PSU was pulling less than 100W (measured at the AC feed) and was running three fans and a hard drive and a few minor bits and pieces like the serial port and network interface, all of which should have summed to maybe 5A.  The numbers didn’t add up.
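The sums are easy to redo. A back-of-envelope sketch, assuming a nominal 12V rail (the measured figure was nearer 11.85V) and using the 1.2A, 5A and 18A figures quoted above; the function name rail_budget is just a label of mine.

```shell
# Back-of-envelope 12V rail budget from the figures in the text.
rail_budget() {
  awk 'BEGIN {
    volts = 12
    printf "fan at full speed: %.1f W\n", volts * 1.2
    printf "estimated 12V load: %.0f W\n", volts * 5
    printf "rated 12V capacity: %.0f W\n", volts * 18
  }'
}

rail_budget
```

Even with generous rounding, the estimated load is well under a third of what the rail is rated for.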

Finally, I had a brainwave.  I removed the fan from the chassis, still running.  The problem went away.  I touched the fan to the drive.  The drive throughput dropped through the floor.

After a few more experiments, the conclusion is that with the fan mounted close to the drive, the vibrations were enough to upset the performance of the drive, consistently.  Two different terabyte drives (one Seagate, one Western Digital) exhibited the same problem.

I duplicated this by applying abnormal vibration to the case of my desktop PC (half terabyte Seagate), and even the grotty little thing I have at home (a Seagate 160GB PATA drive).

Conclusion: all modern drives are subject to potentially serious performance issues when faced with abnormal vibration.  The Supermicro chassis exacerbated the problem because of the placement of the fan with respect to the drive, and the fact the drive is mounted directly to the chassis.  Also, the placement of cables up against the fan meant that vibrations were being transferred directly through the connectors from the fan; something that could be partially alleviated by re-routing the power cable under the fan.

The fact that right angle SATA power connectors are so darned hard to get made this more of an issue than it should have been.

I think a bit of judicious use of closed-cell foam packing, turning the fan speed down, and re-routing cables away from the fan will finally solve the problem.

Hopefully.