Reality is Corrosive

By Sam Ruby, May 4, 2004

I’m growing increasingly convinced that “leaky abstractions” doesn’t go far enough.  The problem isn’t so much that abstractions are porous.  They are.  But it seems like there is something more than that.  Much deeper.  Much more insidious.

A beautiful theory, killed by a nasty, ugly, little fact.
 - Thomas Henry Huxley

Let’s explore a few of these nasty, ugly, little facts.

Can’t Get There From Here

All nodes in an IP network have a routing table.  At the edges, these tables can be quite simple: all outbound data goes here.  Interior nodes are not that much more complicated: they implement simple rules like data destined for IP addresses starting with a 9 goes thisaway, everything else goes thataway.
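
To make the "simple rules" concrete, here is a minimal sketch of such a table. The two routes and the next-hop names are made up for illustration, and real routers do this in hardware with far larger tables; the only idea being shown is that the most specific matching prefix wins.

```python
import ipaddress

# Hypothetical routing table: (network, next hop). Both entries are
# illustrative, borrowed from the wording of the paragraph above.
ROUTES = [
    (ipaddress.ip_network("9.0.0.0/8"), "thisaway"),
    (ipaddress.ip_network("0.0.0.0/0"), "thataway"),  # default route
]

def next_hop(address: str) -> str:
    """Return the next hop of the most specific route containing address."""
    ip = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in ROUTES if ip in net]
    # Longest prefix wins, so 9.0.0.0/8 beats the default route.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop

print(next_hop("9.1.2.3"))    # thisaway
print(next_hop("192.0.2.1"))  # thataway
```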

DNS abstracts these numbers away.  In their place are human friendly names.

HTTP URLs add a hostname (and optionally a port) and a path (and optionally a query) to the scheme in order to form a URI.

WSDL allows one to bind a soap endpoint to an http address.

Let’s play that back, backwards.  A SOAP endpoint is associated with a URI.  If the scheme of that URI is “http”, then the URI is further split into a host and a path.  The host is then mapped into an IP address, and a request containing simply the path and the payload is then sent to this address.

What you will find is that you can implement this absolutely correctly – IP, TCP, HTTP, and SOAP, complete with all the necessary unit tests – and yet still find that your message may never get to the desired destination.  This actually occurred in the SOAP Interop activities.  The problem is virtual hosts.  It turns out that one can’t simply map a DNS name to an IP address and proceed from there; the original DNS name must somehow tag along with the request.

That’s the first leak in the abstraction.

It gets worse.  Look at the original definition of a HTTP request-URI.  “The absoluteURI form is only allowed when the request is being made to a proxy”.  In other words, there is a place for exactly this information in the core protocol, but a restriction is placed on its usage that makes it impossible to be used in this manner.

The solution is to use an extension element – a HTTP Host header – to supplement the information in the base request.  One that isn’t even documented in HTTP 1.0.  Something that is corrected in HTTP 1.1.
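
A rough sketch of what goes wrong, assuming two hypothetical virtual hosts (a.example and b.example) served from the same IP address. The helper name build_request and the Content-Type value are mine, not anything from the SOAP Interop work; nothing is sent over the network, the code only builds the bytes that would be.

```python
from urllib.parse import urlsplit

def build_request(uri: str, body: bytes, send_host: bool) -> bytes:
    """Build a raw HTTP POST for uri; only the path goes on the request line."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [f"POST {path} HTTP/1.0"]
    if send_host:
        # netloc carries the original DNS name (and port, if any) along.
        lines.append(f"Host: {parts.netloc}")
    lines.append("Content-Type: text/xml")
    lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

payload = b"<Envelope/>"
a = build_request("http://a.example/soap", payload, send_host=False)
b = build_request("http://b.example/soap", payload, send_host=False)
print(a == b)  # True: the server cannot tell which site was meant

a = build_request("http://a.example/soap", payload, send_host=True)
b = build_request("http://b.example/soap", payload, send_host=True)
print(a == b)  # False: the name now tags along with the request
```

Every layer in the first pair did its job correctly, and the two requests are still byte-for-byte identical.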

Direct Deposit

When you send a message, you address it to its recipient.  In the case of a phone call, you enter a phone number.  When mailing in a check to pay your electric bill, you address the utility.  Quite different levels of granularity.

Which is right?  It depends on the application.  As a general rule, exposing as much as is possible in the address allows for efficient routing.  Unfortunately, these abstractions break down.

When calling to the US, a phone number consists of a country code, an area code, a local exchange, followed by a number.  With the advent of phone number portability, the concept of an exchange is destroyed.  Area codes now overlap.  How long do you think it will be until people can take their phone numbers with them when they move internationally?

When depositing cash in a bank, you fill out a deposit slip.  In addition to the obvious routing information such as account number and type (savings vs checking), it turns out that the amount is very much relevant in the routing of the transaction.  Depositing $250 in cash is a very different activity than depositing $250,000 in cash, and has very different handling requirements from a business and legal perspective.

So, do you put the account number and the amount on the outside of the envelope?  It seems that business rules and constraints have a way of cutting through abstractions.

Does anybody really know what character this is?

One of the improvements in HTTP 1.1 is a more careful consideration of character sets.

Unfortunately, some older HTTP/1.0 clients did not deal properly with an explicit charset parameter. HTTP/1.1 recipients MUST respect the charset label provided by the sender; and those user agents that have a provision to "guess" a charset MUST use the charset from the content-type field if they support that charset, rather than the recipient's preference, when initially displaying a document.

OK, so the first problem is that some people don’t know how to read specs.  And unfortunately, there seem to be enough of them out there to make the concept of a default character set for a HTTP message an entirely meaningless concept.  So, for the moment, let’s limit the discussion to messages which explicitly specify a character set.

Apparently, the premise is that clients can specify the desired character set, and some software in the “middle” (perhaps the web server, perhaps an intermediary) will transcode the message.  Rumor has it that this actually happens in Japan.

What this is saying is that the character set is something the “application” is explicitly expected to yield control over to the “system”.  I’m not going to define “application” or “system” here, I’m merely intending to convey the notion that there is some layering involved, and that an implicit assumption of HTTP is that character encoding is not controlled at the topmost layer.

Now suppose that you have an application that DOES care about character encoding.  One such application is XML, and the results are insanely complicated.

That didn’t work out so well, so let’s take a simpler example.  An HTML Form, perhaps.  Somehow the provider of a form needs to express a preference for the character set it expects responses back in.  You would think that accept-charset would be ideal for this, but inexplicably, this does not seem to be widely supported.  Instead, the character set used for transmission is treated as indicative of the desired response.

Now, remember, this is not something that the application necessarily can control.  The only reliable way to ensure that this data will not only get sent by your server but will also not be stripped by the layered protocol stack on the receiving side is to place this data inside the payload.

Enter the http-equiv attribute.  In this way, HTML applications can provide information which should have gone into HTTP headers, but for various reasons, isn’t.  Want to see it in action?  View source on Google.

Let’s recap: HTML transported over HTTP has three separate ways to specify the desired charset to be used when constructing a form request.  (Note: XHTML introduces a fourth way).  Of all these ways, the most reliable and well supported mechanism is to tunnel the HTTP data inside a HTML header in a meta tag.
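
For the curious, here is a sketch of how a recipient might recover the charset when the HTTP header has gone missing somewhere in the stack. The names MetaCharset and effective_charset are hypothetical, and the precedence shown (HTTP header first, then the meta tag) is my reading of the usual rule rather than a description of any particular browser.

```python
from email.message import Message
from html.parser import HTMLParser

def charset_of(content_type):
    """Extract the charset parameter from a Content-Type value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

class MetaCharset(HTMLParser):
    """Remember the charset of the first http-equiv Content-Type meta tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        equiv = (attributes.get("http-equiv") or "").lower()
        if tag == "meta" and equiv == "content-type" and self.charset is None:
            self.charset = charset_of(attributes.get("content"))

def effective_charset(http_content_type, html_text):
    """Use the HTTP header when it names a charset, else the tunnelled copy."""
    from_header = charset_of(http_content_type)
    if from_header:
        return from_header
    parser = MetaCharset()
    parser.feed(html_text)
    return parser.charset

page = ('<html><head><meta http-equiv="Content-Type" '
        'content="text/html; charset=UTF-8"></head><body></body></html>')
print(effective_charset("text/html", page))  # utf-8
```

The header said nothing useful, and the payload still carried the answer.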

Passwords

Sending passwords in the clear is just plain dumb.  Encoding passwords prevents eavesdroppers from determining your original password.  Much better.

Oh, wait.  If they can snarf your encoded password, they can then use it in place of your original password.

Oh, crap.

One can use Nonces to ensure that what can be observed can't be reused, and this prevents non-intrusive capture methods, but doesn't prevent a more sophisticated attack whereby your transmission is intercepted and the payload is replaced.  There are times that this may be important.
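
A toy illustration of both points, assuming SHA-256 as the "encoding" and an HMAC over a server-issued nonce as the response. Those choices, and the Server class itself, are mine and purely illustrative; this is not a recommendation for storing or checking real passwords.

```python
import hashlib
import hmac
import secrets

def encode(password: str) -> str:
    """The 'encoding' step: a plain hash of the password (assumed scheme)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def answer_challenge(password: str, nonce: str) -> str:
    """Client side: bind the encoded password to one specific nonce."""
    key = encode(password).encode("ascii")
    return hmac.new(key, nonce.encode("ascii"), hashlib.sha256).hexdigest()

class Server:
    def __init__(self, password: str):
        self.stored = encode(password)
        self.outstanding = set()

    def check_encoded(self, token: str) -> bool:
        # Scheme 1: whatever matches the stored encoding is accepted.
        return hmac.compare_digest(token, self.stored)

    def challenge(self) -> str:
        nonce = secrets.token_hex(16)
        self.outstanding.add(nonce)
        return nonce

    def check_response(self, nonce: str, response: str) -> bool:
        # Scheme 2: each nonce is honoured once, then forgotten.
        if nonce not in self.outstanding:
            return False
        self.outstanding.discard(nonce)
        key = self.stored.encode("ascii")
        expected = hmac.new(key, nonce.encode("ascii"), hashlib.sha256).hexdigest()
        return hmac.compare_digest(response, expected)

server = Server("hunter2")

captured = encode("hunter2")           # what the eavesdropper saw on the wire
print(server.check_encoded(captured))  # True: the replay works

nonce = server.challenge()
response = answer_challenge("hunter2", nonce)
print(server.check_response(nonce, response))  # True: the legitimate login
print(server.check_response(nonce, response))  # False: the replay is refused
```

Note that nothing here covers the message body, which is exactly the gap the interception attack exploits.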

OK, so to do better, one needs to sign the content, not just the password.  At this point one becomes awash in a sea of acronyms: x509, PGP, IANA Kerberos, RSA, etc.  One set of approaches requires the establishment of a secure virtual circuit between the source and destination.  Another set of approaches is more suited to one time asynchronous messages and legacy transports: signing the message itself.  The second approach sounds easier, so let’s go with it.

What kind of messages are we likely to be sending?  How about XML… seems to be all the rage.  Rumor has it that some of the advocates are pedantic about syntax, so this seems to be a pretty safe bet.  But this turns out to be a lie.  XML consumers generally don’t care whether an ampersand is “spelled” as &amp; or &#38; or &#x26;.  Or whether you use iso-8859-1 or utf-8, as long as you declare it and it is supported by the parser.  Or about the amount of whitespace separating attributes in an element.  Or even the order of attributes.  In fact, there is a whole list of things they don’t care about.  But none of these affect the signature, do they?

Oh, crap.

OK, so let’s create a W3C working group to create a canonical representation of XML and define how such signatures are to be put into messages.
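
A small demonstration of the problem and of what canonicalization buys you. A SHA-256 digest stands in for the signature here, which is my simplification, and the two sample documents are invented. The canonicalize function in Python's standard library (3.8 and later) implements C14N 2.0, which postdates this essay, so treat it as a stand-in for whatever the working group produced.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Two spellings of the same document: different quotes, attribute order,
# whitespace between attributes, and a different spelling of the ampersand.
doc_a = '<order id="7" currency="USD">Fish &amp; Chips</order>'
doc_b = "<order   currency='USD'   id='7'>Fish &#38; Chips</order>"

def digest(text: str) -> str:
    """Stand-in for a signature: a hash over the exact bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

print(digest(doc_a) == digest(doc_b))  # False: the bytes differ

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
print(canon_a == canon_b)                  # True: one canonical spelling
print(digest(canon_a) == digest(canon_b))  # True: the digests now agree
```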

Meanwhile, it seems that people are defining ways to put the exact thing that you should be signing into something called an attachment.

Expletive deleted.

Separating Presentation and Data

Cascading Style Sheets (CSS) are a mechanism for adding style (e.g. fonts, colors and spacing) to HTML documents.  One of the key precepts is that one should separate content from presentation.

One concept that is not present in CSS is any notion of targeting information for a specific browser.  This is a hotly contested debate.  From an abstract or conceptual point of view, placing browser specific information inside of a CSS file is the wrong thing to do.  From a practical point of view, it creates very real problems.

Despite this, people exert incredible energies to develop and catalog such hacks.

Why?  Simply put, such things – while being very, very wrong and creating more problems – are very much necessary.

recommendations

Additional examples

Producing an exhaustive list here is very impractical, but from time to time, I plan to return here to add links to others I can find on the web:

Another possible list to start collecting would be abstraction shattering concepts.  For example, HTTP’s notion of ETag and Last-Modified both require layer piercing bullets in order to be implemented correctly and completely.  Much easier to design from the beginning if you are proceeding outward-in, but a bitch to retrofit if you are proceeding inside-out.
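
To give a feel for the layer piercing involved, here is a sketch in which the ETag is derived from the response body. That derivation, the 16-character truncation, and the names etag_for and handle_get are all assumptions of mine; servers are free to compute validators in other ways.

```python
import hashlib

def etag_for(body: bytes) -> str:
    """Derive a quoted entity tag from the body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def handle_get(render, if_none_match=None):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    # The header cannot be written until the application has produced
    # the entire body: the protocol layer reaches into the content layer.
    body = render()
    tag = etag_for(body)
    if if_none_match == tag:
        return 304, {"ETag": tag}, b""
    return 200, {"ETag": tag}, body

def render():
    return b"<html><body>hello</body></html>"

status, headers, body = handle_get(render)
print(status)  # 200

status, _, body = handle_get(render, if_none_match=headers["ETag"])
print(status)  # 304
```

An application designed around this from the start can keep a cheap version stamp for each resource; one retrofitted later tends to end up rendering the whole page just to decide not to send it.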
