Facebook’s HipHop

I spent several hours Saturday poking around in Facebook’s new HipHop PHP compiler. I haven’t successfully built the program yet, but after taking some time to review the source code I’m very optimistic.

I would have a build already, but I first tried to build the tool on a virtual server with 256 MB of RAM. The problem is that one of the source code files is 1.8 MB of text. The compiler’s footprint ballooned to 800 MB and swap-thrashed for a while before ultimately running out of memory when it tried to compile that file. I’m going to build a 2 GB VM soon, and that should be plenty of RAM to get it to build successfully.

Avoiding javascript injection

The cardinal rule of web development is to never trust user-supplied data.  A surprising number of developers don’t take this seriously when inserting data into a database.  An even larger group incorrectly trusts their raw data for output.  This opens the browser up to what are called injection attacks.

Injection attacks let malicious users get your application to output things you never intended it to, like a block of javascript that passes the session id to a remote server.  The solution is to always convert your data into a benign form before outputting it.  With database queries this means escaping both the quote and backslash characters inside your string variables.  In HTML it means converting dangerous characters into HTML entities.  (Those little &lt;, &gt;, and &amp; things you’ll see all over the source of the better web sites.)
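A minimal PHP sketch of those two techniques (my own illustration, assuming an open mysql connection and made-up variable names):

<?php
// Database escaping: backslash-escape quotes and backslashes before
// splicing user data into a query string.
$sql = "SELECT * FROM comments WHERE author = '"
     . mysql_real_escape_string($author) . "'";

// HTML escaping: turn <, >, &, and quotes into entities before output.
echo htmlspecialchars($comment, ENT_QUOTES);
?>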

Usually, following these two techniques religiously is enough to secure your application against injection attacks.  However, I ran into an interesting problem the other day that requires a third type of escaping.

Continue reading “Avoiding javascript injection”

Virtualization for consultants

Virtualization has really taken off over the last couple of years.  One of the sites I work on is hosted on a virtual machine over at SliceHost (they’re excellent BTW).

A company I was with a few years back has switched their whole development environment to fully automated VMs.  They can create a new clone of a VM and fire up a clean copy in a matter of minutes.

In my consulting work I’ve stumbled onto a problem that virtualization solves wonderfully.  The problem is this: small clients never have a development server for me to work on.  Many clients would prefer that I just develop directly on their production machine.  My position, however, is that I will never write or debug code on a production server, even if the site gets practically zero traffic.  It’s just a matter of principle.  What’s more, if you ever want to be a big site, you should make sure your site isn’t down with errors all day long while people are coding up the next version.

When I first started consulting, I grabbed an old PC that I had snatched up at a surplus sale for $10 and loaded Linux on it.  When I started work with a new client I could usually configure a LAMP stack for that client’s peculiarities in less than an hour, and then I’d be in business.  Of course, if I wanted to juggle multiple clients I had to use shell scripts to swap out the apache/php configs.
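Something along these lines would do the trick (a sketch of my own; the paths and names are assumptions, not my actual scripts):

#!/bin/sh
# use-client.sh -- point Apache at one client's vhost config and reload.
# Usage: use-client.sh <clientname>
ln -sf /etc/apache2/clients/"$1".conf /etc/apache2/conf.d/current-client.conf
apachectl graceful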

Then one day I decided to give virtualization a try.  It is absolutely fabulous!  I upgraded my little $10 machine with a 300 GB hard drive and set up a VM for each client on it.  Each client gets 10-20 GB.  If I ever outgrow the 300 GB drive it will be a no-brainer to go grab a terabyte drive for whatever ridiculously cheap price they’re selling for, and I’ll have plenty of room to grow.

Now if I want to switch from my VentureReturns VM to my HumanServicesHQ VM, I simply issue the following command (-w waits for the shutdown to finish; -c attaches my terminal to the new VM’s console).

xm shutdown vr -w; xm create hshq -c

Back on my desktop machine I just switch eclipse from one workspace to another and within about a minute I’m ready to work on an entirely different platform.  How cool is that!

jQuery UI redemption

I noticed today that the jQuery UI folks have gone about correcting the problems with the 1.6 release candidates.  They’ve chosen to tie UI version 1.6 to jQuery 1.2 and create a new UI 1.7 to work with jQuery 1.3.   I think this is a great idea.  Between the CSS changes in the new UI version and the UI tools that were already leveraging new jQuery 1.3 functionality, it was obvious that 1.6 could not work with both jQuery 1.2.6 and 1.3.

I noticed tonight that the UI homepage shows jQuery 1.3 next to UI 1.6rc6 and jQuery 1.2.6 next to UI 1.5.3.  It would have been nice if they had renumbered 1.6rc6 to 1.7rc1, but I’m at least glad that the site makes it clear what will work with what.

Over at Human Services HQ we’ve stabilized on jQuery 1.3, jQuery UI 1.6rc6, a third-party autocomplete, and thickbox.   The only change I had to make to get jQuery UI to allow a calendar inside the thickbox was to add a z-index to the calendar div.  Since then it has been smooth sailing.
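That fix boils down to a single CSS rule along these lines (my sketch, not the exact rule from the site; jQuery UI renders the calendar into a div with id ui-datepicker-div, and the value just has to beat the thickbox window’s z-index):

#ui-datepicker-div { z-index: 110; }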

Despite the growing pains of this jQuery UI release, I remain a big fan.  If you are looking for a lightweight javascript framework, jQuery can’t be beat.  If you want some UI widgets to use on top of jQuery, then jQuery UI is the obvious first place to look.

jQuery UI is awesome…. mostly…

A lot of pain recently with jQuery UI. They are transitioning to a major new version: 1.6. First off, there is no official jQuery UI release compatible with jQuery 1.3. That’s fine; jQuery 1.3 is very new. It would have been nice if they could have released a UI version at the same time, but that’s their prerogative. Normally I would have just waited for things to settle down anyway.

Unfortunately, one of my projects has the distinction of being one of the few production sites running a jQuery UI 1.6 release candidate. We needed it to fix a problem with the jQuery UI 1.5.3 sortable, which had a bug with nesting inside a scrollable div that was fixed in jQuery UI 1.6. So we’ve been watching 1.6 move toward a full release, and it has been a bit painful.

We needed to add an autocomplete control to our site, so we looked into it and were horrified to learn that there are basically four different versions, all interrelated, all with terrible documentation. What’s worse, some of the UI 1.6 release candidates included an autocomplete with a different API convention, but in 1.6rc6 that autocomplete is gone. I found this out the hard way when a build generated from an earlier RC mysteriously broke my page. It turned out that if you downloaded the complete RC you got the autocomplete, but the custom download builder had no option for the autocomplete and simply left it out.

After I got that fixed, I discovered that merely using the new JS wasn’t going to work, as jQuery UI 1.6 needs corresponding 1.6 CSS. So now I have to roll a new theme.

Now that I’ve worked with jQuery I refuse to work without it, so I’ll have to ride this out, but it has been a painful week for me to watch the jQuery UI folks stumble.

The end result will hopefully be this:

  • jQuery 1.3.1
  • jQuery UI 1.6rc6
  • a third-party autocomplete (I can’t remember which of the four we’re using.)

Update: The above-mentioned combination is working for the most part.  Now we’re having a problem where the popup for the date control shows up behind a thickbox control we’re using.  Hopefully I can find a CSS fix to make it work.

jQuery is awesome

When I started my latest project a friend of mine asked me to look into using jQuery. I had played with MooTools a couple years ago and was mildly impressed, and I figured jQuery would be more of the same.

Boy was I wrong! For what I’m doing jQuery is way better than MooTools!

MooTools is basically two things:

  • A bunch of extensions for the standard javascript classes
  • A set of really nice UI widgets

jQuery is similar, but the approach is a lot different:

  • A tool for grabbing a collection of DOM elements.
  • A mechanism for manipulating the collection once you grab it.
  • And if you use jQuery UI, a set of really nice UI widgets

The core of jQuery is a simple CSS-like syntax for grabbing a collection of DOM elements from the page.  It turns out that the ability to grab the DOM elements you need is actually a disproportionately large part of building a modern web app.  Don’t believe me?  Say you need to output a formatted table with even rows colored differently than odd rows.  Traditionally, this would mean crafting a CSS class named “even” and then writing server-side code to determine whether we were outputting an odd row or an even row.  If it was an even row we would add class="even" to the tr tag.  This would all be done inside the big loop that outputs the table rows.
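Here is a sketch of that traditional server-side approach in PHP (my own reconstruction, assuming a $rows array of cell text):

<?php
// Traditional approach: decide the row class server-side, inside the loop.
echo "<table>";
foreach ($rows as $i => $cell) {
    $class = ($i % 2 == 0) ? ' class="even"' : '';   // 0-based, like jQuery's :even
    echo "<tr$class><td>" . htmlspecialchars($cell) . "</td></tr>";
}
echo "</table>";
?>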

In jQuery I can just output the table without the “even” class and then add it with javascript after the fact, in one line of code:

$('table tr:even').addClass('even');

Boom! No need to write any server side code just to alternate the background color of the rows. (One gotcha: browsers insert an implicit tbody into every table, so a child selector like table > tr would match nothing; use the descendant selector shown above.)

The beauty that is “xmllint”

Up until just recently I thought that there were no XML validators available under GPL terms. It turns out that the XMLSoft people have built a program named “xmllint” that will validate your XML against a DTD you reference.

So I started looking into XML validation. Up until now it had always seemed like more work than it was worth. Little did I know I would scarcely have to do a thing.

All you need to do to validate your XML is pass it to xmllint with the --valid flag. xmllint is part of the libxml2 suite, by the same people. My Gentoo machine already had it installed, as did a Red Hat machine that I use frequently.

Below is a sample XML document and the command line I used to validate it.

test.xml:
<!DOCTYPE article SYSTEM "/articles.dtd">
<article>

<p>This is a single paragraph article.</p>
</article>

Command line

xmllint --valid test.xml

Notice the “<!DOCTYPE” line? The second parameter is the name of the outermost tag of your document; in my case this was “article”. “SYSTEM” means that we are validating against our own DTD rather than a well-known public one. The final parameter is the path to your DTD. That’s it.
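For completeness, here is the sort of thing /articles.dtd might contain for this particular document (a minimal example of my own; a real article DTD would surely allow more):

<!ELEMENT article (p+)>
<!ELEMENT p (#PCDATA)>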

xmllint will return an exit code that tells you how it went. Zero means it worked, nonzero means there were errors. It will also output any errors to stderr. For my purposes I wanted to capture the errors and present them to a web client. Here is the php I used to make that happen.
<?php
// Run xmllint quietly, merging stderr into stdout so we can capture errors.
$cmd = "xmllint --valid --noout " . escapeshellarg($filename) . " 2>&1";
exec($cmd, $output, $return_code);
?>

There are a couple of items in the above example that I should probably explain now.

  • The --noout option tells xmllint not to echo the contents of the file it validates.
  • escapeshellarg() is a PHP function that does its best to make your filenames safe for the command line. You should use EXTREME caution whenever dealing with anything you are going to run through exec().
  • The 2>&1 tells the shell to merge stdout and stderr into one stream. In this case we used it to capture stderr into our $output variable.
  • The $output variable is a little quirky: it is returned as an array of lines.
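To round the example out, here is a sketch of how the captured errors might be presented to a web client (my own illustration, not code from the original setup). Note the htmlspecialchars() call; xmllint quotes your markup back at you, so the output needs escaping:

<?php
if ($return_code !== 0) {
    // xmllint writes one error per line of $output; escape before echoing.
    echo "<pre>" . htmlspecialchars(implode("\n", $output)) . "</pre>";
} else {
    echo "<p>The document is valid.</p>";
}
?>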

Now that you have seen how easy it is to validate your XML documents, I hope you’ll take the time to validate your XML where appropriate. I know I will be.

Mixing static and dynamic linking

Most of us do nothing but dynamic linking in our small C or C++ programs, but what do you do if you need both? I recently found myself in just this situation. The answer seemed to be so obvious to people that nobody had bothered to document it. Here is what I found:

Static linking is actually really easy to combine with dynamic linking. All you need to do is list the full path of the static library you want to link instead of letting the -l option find it for you. Here is a real-world example that I used to link libsqlplus.a (static) with libmysqlclient.so (dynamic).

INC =   -I/usr/include/mysql/ -I/usr/include/sqlplus
WARNINGS = -Wno-deprecated

# Note, libsqlplus is picky about where it builds,
# so I've linked it statically from a known good build.

test: test.cc
    g++ $(INC) $(WARNINGS) test.cc \
       -L/usr/lib/mysql/ /usr/local/lib/libsqlplus.a -lmysqlclient -lz -o test

Note that libsqlplus.a is explicitly listed with its full path, while libmysqlclient and libz are just linked in using -l and -L. Not so bad, eh?
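As an aside, GNU ld can also toggle linking modes per library with -Wl,-Bstatic and -Wl,-Bdynamic, so an equivalent link line (an untested sketch, assuming libsqlplus.a sits on the -L search path) would be:

g++ $(INC) $(WARNINGS) test.cc -L/usr/lib/mysql/ -L/usr/local/lib \
    -Wl,-Bstatic -lsqlplus -Wl,-Bdynamic -lmysqlclient -lz -o test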

Building MySQL C++ Connector

MySQL/AB distributes a super cool C++ database wrapper for MySQL that you can use under the terms of the LGPL to develop your apps, but the problem is that they don’t document very well how to build it from source. I tried downloading the patches and simply piping them into patch, with little success. It turns out that you have to apply the patches using some special options.

First you’ll need to track down the patches and source. I used the SRPM for Red Hat 9. They also have three of the five patches available for download directly from mysql.com. Once you have the patches, run this sequence of commands.

patch -p1 -d mysql++-1.7.9 < mysql++-gcc-3.0.patch
patch -p1 -d mysql++-1.7.9 < mysql++-gcc-3.2.patch
patch -p1 -d mysql++-1.7.9 < mysql++-gcc-3.2.2.patch
patch -p1 -d mysql++-1.7.9 < mysql++-prefix.patch
patch -p1 -d mysql++-1.7.9 < mysql++-versionfix.patch
cd mysql++-1.7.9
rm Makefile.in aclocal.m4 build.sh config.guess config.h config.status \
   config.sub configure install-sh libtool ltconfig ltmain.sh missing \
   mkinstalldirs stamp* examples/Makefile.in sqlplusint/Makefile.in
libtoolize
aclocal
automake --foreign --add-missing
autoconf
./configure
make

The values in /proc

Ever notice that nobody on the web documents the /proc filesystem very well? Well, guess what: there is a man page for it.

I’ve always been extremely frustrated by the fact that /proc has thousands of values in it and I didn’t know what any of them meant. Well, it turns out that running

man proc

will actually give you the man page for the entire /proc file system (it lives in section 5 of the manual, so “man 5 proc” also works). I know this eliminated one of my huge sources of frustration with Linux documentation.