Virtualization for consultants

Virtualization has really taken off over the last couple of years.  One of the sites I work on is hosted on a virtual machine over at SliceHost (they’re excellent BTW).

A company I worked with a few years back switched their whole development environment over to fully automated VMs.  They can clone a VM and fire up a clean copy in a matter of minutes.

In my consulting work I’ve stumbled onto a problem that virtualization solves wonderfully.  The problem is this: small clients never have a development server for me to work on.  Many clients would prefer that I just develop directly on their production machine.  My position, however, is that I will never write or debug code on a production server, even if the site gets practically zero traffic.  It’s just a matter of principle.  What’s more, if you ever want to be a big site, you should make sure your site isn’t down with errors all day long while someone codes up the next version.

What I did when I first started consulting was to grab an old PC I had snatched up at a surplus sale for $10 and load Linux on it.  When I started work with a new client I could usually configure a LAMP stack for that client’s peculiarities in less than an hour, and then I’d be in business.  Of course, if I wanted to juggle multiple clients I had to use shell scripts to swap out the Apache/PHP configs.

Then one day I decided to give virtualization a try.  It is absolutely fabulous!  I upgraded my little $10 machine with a 300 Gig hard drive and set up a VM for each client on it.  Each client gets 10-20 gigs.  If I ever outgrow my 300 Gig drive it will be a no-brainer to go grab a terabyte drive for whatever ridiculously cheap price they’re selling for, and I’ll have plenty of room to grow.

Now if I want to switch from my VentureReturns VM to my HumanServicesHQ VM, I simply issue the following command.

xm shutdown vr -w; xm create hshq -c

Back on my desktop machine I just switch Eclipse from one workspace to another, and within about a minute I’m ready to work on an entirely different platform.  How cool is that!
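
To cut down on typing, a tiny wrapper script helps.  Here’s a minimal sketch (the script name and the argument convention are my own invention, not anything Xen ships with):

#!/bin/bash
# switch-client.sh -- shut down one Xen guest and boot another
# usage: switch-client.sh vr hshq
OLD=$1
NEW=$2
xm shutdown "$OLD" -w   # -w waits until the domain is fully down
xm create "$NEW" -c     # -c attaches the console so you can watch it boot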

Using “xargs” and “convert” to change image file formats

Suppose you have a directory full of .gif images that you want to convert to .jpg files. Here’s how to combine a handful of command line tools to convert them all. No clicking, dragging, or context menus required. 🙂

Let’s build this from the ground up. We’ll need four components, which I’ll detail below.

  1. First, we need a list of files ending with .gif. That should be easy enough.
    ls *.gif
  2. Second, we need to remove the .gif extension from the filename so that we can later append .jpg to create the new filename. To do this we use sed.
    sed -e "s/\.gif$//"

    This tells sed to read from stdin and strip a trailing .gif from any line that ends with it. (The dot is escaped so that it matches a literal period rather than any character.)

  3. Third, we need a way to run a program once for each file we’re modifying. That is where xargs comes in. It reads from stdin and runs whatever program you tell it to, and it even builds the rest of the command line for you. A slightly useless example would be
    xargs -n1 echo

    which would run the “echo” command once for each line read from stdin. Echo would turn around and output the string again.

  4. The final piece of the puzzle is a program for converting between image formats. We’ll use the aptly named “convert” from ImageMagick. All we do is pass in an input filename and an output filename; convert takes care of the rest.
    convert myimage.gif myimage.jpg

Now we combine all these pieces through the magic of pipes, and we can convert all the .gif files in a directory to .jpg files.

Here is the command:

ls *.gif | sed -e "s/\.gif$//" | xargs -n1 --replace convert {}.gif {}.jpg

Just a couple more points of detail:

  • ls *.gif will automatically output one filename per line when we use it in a “pipe” situation
  • The -n1 argument says that we want xargs to read one and only one line of input each time it runs convert. Otherwise it would try to read all of its stdin at once, which would confuse convert, since convert expects one input file per command line.
  • The --replace argument to xargs says that we want to replace {} with the value read from stdin. If we didn’t do this, xargs would tack the line on as the final argument when it ran convert (like the previous xargs example). Note that --replace already implies reading one line per run, so newer versions of xargs may warn that -n1 is redundant.
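
One caveat: because the pipeline splits input on whitespace, it will trip over filenames containing spaces. A plain shell loop sidesteps the problem entirely; here’s a minimal sketch:

for f in *.gif; do
    convert "$f" "${f%.gif}.jpg"   # ${f%.gif} strips the trailing .gif
done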

Setting your $PATH variable

Ever want to run a program and your shell doesn’t know where to find it? How does your shell know how to find some directories and not others? The short answer is that there is a variable named $PATH that contains a list of directories to search.
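
You can see your current path at any time by echoing the variable. On a typical system it looks something like this (the exact directories will differ on your machine):

echo $PATH
/usr/local/bin:/usr/bin:/bin:/home/you/bin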

The first thing you need to know about setting the path in Linux is that the technique for setting it is shell-specific. We’ll be concentrating on the Bourne-Again Shell (bash).

Every bash user can have two files in their home directory that bash looks for at certain times. The first is .bash_profile, which runs once per login. The other is .bashrc, which runs once for each new interactive shell (one attached to a terminal where you type commands, as opposed to one running a script).

Your .bash_profile file is the appropriate place to set your path, but there is a little more to it than that. So let’s take a look at a simple .bash_profile file:

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
unset USERNAME

As you can see, we set the PATH variable to $PATH:$HOME/bin. In other words, take everything that is already in the path and tack on the “bin” subdirectory of your home directory ($HOME).

Why would we append to the path rather than just setting it? The reason is that other files included before this one have already put entries in the path. You’ll also note the strange “if [ -f ~/.bashrc ]” statement. That line (and the one after it) means: if ~/.bashrc exists, run it.

So what is in my ~/.bashrc? Let’s take a look:

# User specific aliases and functions

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

This file is where we would include any aliases we wanted to set up. Then we turn around and use the same trick we used before to run /etc/bashrc.
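
For example, here are a couple of aliases one might define there (these particular shortcuts are just illustrations, not anything the default file includes):

# shorthand for a long directory listing
alias ll='ls -l'
# prompt before overwriting files on copy
alias cp='cp -i'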

This chain of files including other files actually goes on for another level or two. When it’s all said and done, any script along the way could have added entries to the path. That is why we append to the path rather than setting it directly.

So let’s show a real-world example of setting the path variable. Sometimes /usr/local/bin isn’t part of your path. To add it, simply edit ~/.bash_profile. Somewhere in that file add the following:

PATH=$PATH:/usr/local/bin
export PATH

After that you can either type “. ~/.bash_profile” or just log out and log back in. Either way your path will be set from that point on.
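
You can confirm the change took effect with something like the following (foo here is a stand-in for whatever program in /usr/local/bin you were missing):

. ~/.bash_profile
echo $PATH    # should now end with :/usr/local/bin
which foo     # should print /usr/local/bin/foo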

How to change users with “su” and have your path work correctly

It is best to do all your day-to-day tasks as a user with the minimum privileges needed to accomplish them. This prevents you from accidentally making drastic changes to your install or inadvertently running a trojan horse that listens on a privileged port. But what happens when you su to root and none of your root commands are in your path? All of a sudden you have to run locate or find just to run a program. What’s worse, the previous user’s path is still in effect, so if that account is compromised, an attacker could plant a trojan horse in one of its path directories in the hopes that root would eventually run it.

For the solution, read on…

If you use

su -

you are telling the su command to give you a “login” shell, which basically acts as if you had just logged in. It will set up your path correctly, change to your home directory, and generally behave as though you weren’t logged in before. This is super convenient for all those utilities that live in /sbin and /usr/sbin that you probably don’t keep in the path of the user you use for your actual work.
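
You can see the difference for yourself; the exact directories vary by distribution, but the shape is something like this:

su
echo $PATH    # still the old user's path; /sbin and /usr/sbin are likely missing
exit
su -
echo $PATH    # root's own login path, with /sbin and /usr/sbin included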

The values in /proc

Ever noticed that nobody on the web documents the /proc filesystem very well? Well, guess what: there is a man page for it.

I’ve always been extremely frustrated by the fact that /proc has thousands of values in there and I never knew what any of them meant. Well, it turns out that running

man proc

will give you the man page for the entire /proc file system. This eliminated one of my huge sources of frustration with Linux documentation.
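
A few of the better-known entries, just to give a flavor of what the man page covers:

cat /proc/loadavg    # load averages and process counts
cat /proc/meminfo    # detailed memory usage statistics
cat /proc/cpuinfo    # one block of details per CPU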

Setting your hardware clock with a new date and time

The Linux system clock runs separately from the actual clock hardware on the motherboard. Some programs may read directly from the hardware clock instead of the system clock, which can lead to trouble if the two disagree. Here is how you can set and sync them.

First, to set the system clock, you use the date command. date tries to be really intelligent about the date strings you feed it. The following was sufficient to set the date on my machine:

date --set "12:24am Dec 14,2003"
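
For what it’s worth, an ISO-style string works just as well (the exact formats accepted depend on your version of date):

date --set "2003-12-14 00:24"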

Once you set your system clock, the only thing you have to worry about is what happens when you reboot. That is where the hardware clock comes in: it keeps track of the time while your machine is turned off, and your OS pulls the time from it when it boots up. You can sync to/from the hardware clock at will with the commands below.

To set the hardware clock to match the system clock:

hwclock --systohc

To set the system clock to match the hardware clock:

hwclock --hctosys
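
To check whether the two clocks agree, you can display each one:

date              # show the system clock
hwclock --show    # show the hardware clock (usually requires root)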

Note: Thanks to Daniel Farinha for pointing out that the original article had --systohc and --hctosys backwards.