It’s just data

Weblog Software Rewrite Underway

I’ve clearly been neglecting my little spot on the web.

It has gotten so bad that Brendan Eich had to link to a web archive copy of a page of mine.  I must say, however, that it is very ironic and amusing that it was that particular page.  General outline of my current approach:


WHATWG/W3C Collaboration

I’ve been having fun working on the URL Living Standard. All good things must come to an end. Now it is time to spell out a path forward.


The URL Mess

tl;dr: shipping is a feature; getting the URL feature well-defined should not block HTML5 given the nature of the HTML5 reference to the URL spec.

This is a subject desperately in need of an elevator pitch.  From my perspective, here are the top three things that need to be understood.


HTML5 Mode Links

Based on a suggestion by Tim Bray, I converted my board agenda Angular.js application to use html5 mode.  The process was straightforward:

1) Add the following to your application configuration:


2) Add a <base> element to my generated HTML, indicating which part of my path was “owned” by the server.

3) Convert my relative links.  Based on how my application was structured:

I’ve not yet tested it with Internet Explorer <= 9, but the Angular.js docs indicate that it should work there too.
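
To make steps 2 and 3 a bit more concrete, here is a minimal sketch of how relative links resolve against a <base> href. It is not code from my application: the base path, the link values, and the resolveLink helper are all hypothetical, and it uses the standard URL class rather than anything specific to Angular.js.

```javascript
// Hypothetical base href: the part of the path "owned" by the server.
// The real value used by the application is not shown in this post.
const base = 'https://example.org/board/agenda/';

// Hypothetical helper: resolve a link the way a browser would
// once <base href="..."> is present in the page.
function resolveLink(href, baseHref) {
  return new URL(href, baseHref).href;
}

// A relative link stays inside the path owned by the application.
console.log(resolveLink('reports/', base));
// A parent reference climbs out of the agenda and up one level.
console.log(resolveLink('../', base));
// A leading slash ignores the base path entirely.
console.log(resolveLink('/other', base));
```

The point of the sketch is that once a <base> element is in place, every relative link on the page is interpreted against it, which is why the links needed converting.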

Software in 2014

Tim Bray: We’re at an inflection point in the practice of constructing software. Our tools are good, our server developers are happy, but when it comes to building client-side software, we really don’t know where we’re going or how to get there.

While I agree with much of this post, I really don’t think things are as bad as Tim portrays them. I agree that there are good server-side frameworks, and that doing things like MVC is the way to go.

I just happen to believe that this is true on the client too – including MVC. Not perfect, perhaps, but more than workable. And full disclosure, I’m firmly on the HTML5-rocks side of the fence.


Ruby bindings for Gumbo HTML5 parser

Jonathan Tang: We’re pleased to announce the open source release of the Gumbo HTML parser, a C implementation of the HTML5 parsing algorithm.

I’ve posted a proof of concept Ruby binding to github.

In defence of Polyglot

I see that Henri Sivonen is once again being snarky without backing his position.  I’ll state my position, namely that something like the polyglot specification needs to exist, and why I believe that to be the case.

It makes sense for authors who may produce a handful of pages to be processed by an uncountable number of imperfect tools to agree on restrictions that may go well beyond the minimal logical consequences from normative text elsewhere if those restrictions increase the odds of the document produced being correctly processed.

Such restrictions are not a bad thing.  In fact, such restrictions are very much a good thing.


Taming the wild, wild web

Bill McCoy: EPUB in effect takes the Wild, Wild Web and tames it. EPUB for example requires use of the XML serialization of HTML5 (XHTML5), rather than “Tag Soup” aka “Street” HTML. This means that EPUB content, unlike arbitrary web pages, can be reliably created and manipulated with XML tool chains. EPUB defined Reading System conformance more tightly than HTML5 defines for browser User Agents, pinning down things that are under-specified in the union of W3C standards. [via Patrick Mueller]

Wunderbar now does Sinatra



The result is a lot like Markaby, except you get to be/have to be explicit when you are creating a tag.  In this demo, there is no logic, so the benefits of doing so are less clear, but they include being able to use tags that aren’t known to Markaby, such as the ones that were added in HTML5.  Both inline templates and views are supported, but support for layouts has yet to be added.

Future plans include Rails.


No more "XML parsing failed" errors

Andreas Bovens: we’ve decided to stop throwing draconian XML parsing failed error messages, and instead, attempt to reparse the document automatically as HTML.

W3C License Poll

The W3C HTML Working Group recently had a preference poll on which license should be used for the HTML specification.  As directed, the W3C PSIG prepared three non-forkable license options.  Additionally, Mozilla provided two forkable license options.

For better or worse, the W3C is a member organization.  I’ve broken out the results by affiliation.


Implementing Open Standards in Open Source

Lawrence Rosen: Specifications are different from software, but they are weapons in the competitive software wars and they are subject to legal control by contract and by law. Companies try to control specifications because they want to control software that implements those specifications. This is often incompatible with the freedom promised by open source principles that allow anyone to create and distribute copies and derivative works without restriction.  This article explores ways that are available to compromise that incompatibility and to make open standards work for open source.


Elijah Insua: require('jsdom').env('',function(e, w) { console.log( w.document.getElementsByTagName('a').length, 'node releases!' )})

weld also looks promising.  For my use case, I would like to have it be able to handle hash values which implement the DOM Element Interfaces.


Charter Extension

Philippe Le Hégaret: Starting in March, W3C will dedicate new staff to drive development of an HTML5 test suite.

Also: W3C Confirms May 2011 for HTML5 Last Call, Targets 2014 for HTML5 Standard.

Breaking the Web with hash-bangs

Mike Davies: So the #! URL syntax was especially geared for sites that got the fundamental web development best practices horribly wrong, and gave them a lifeline to getting their content seen by Googlebot.
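
For readers who haven't seen the mechanics, here is a rough sketch of the rewriting that the #! convention depends on, assuming the mapping described in Google's AJAX crawling scheme: the crawler moves everything after #! into an _escaped_fragment_ query parameter and requests that URL instead. The toCrawlerUrl name and the example URL are hypothetical, and the sketch skips the extra percent-escaping the scheme calls for in some fragments.

```javascript
// Sketch only: map a "#!" URL to the "_escaped_fragment_" form that a
// crawler would request. Assumes the fragment needs no extra escaping.
function toCrawlerUrl(prettyUrl) {
  const url = new URL(prettyUrl);
  if (!url.hash.startsWith('#!')) {
    return url.href; // not a hash-bang URL, so nothing to rewrite
  }
  const fragment = url.hash.slice(2); // drop the leading "#!"
  url.hash = '';
  // Append to an existing query string if there is one.
  const separator = url.search ? '&' : '?';
  url.search = url.search + separator + '_escaped_fragment_=' + fragment;
  return url.href;
}

// Hypothetical example URL.
console.log(toCrawlerUrl('https://example.com/#!/status/123'));
```

Everything after the # never reaches the server in an ordinary request, which is why a separate crawler-facing URL is needed at all.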

Helping Users Install WebM Support

Henri Sivonen: When you publish WebM content, instead of explaining which browsers support WebM, you can simply link to and it will detect if the user’s browser supports WebM. If the browser doesn’t support WebM, the page will suggest upgrading the browser to a new version that supports WebM, installing a WebM decoder if the browser supports 3rd-party decoders and one is available, switching to another browser or using another operating system (as applicable and in that order).


John Cowan: I’ve been developing a parser for MicroXML which I have dubbed MicroLark, in honor of Tim Bray's original 1998 XML parser Lark. I didn’t take any code from Lark, but we ended up converging on similar ideas: it provides both push and tree parsers (as well as a pull parser), it is written in Java, and I intend to evolve it as MicroXML evolves.

I’ll openly admit at this point that I’m skeptical about the prospects of MicroXML.  I continue to be more hopeful about XML5.  That being said, my own personal efforts have stalled for the moment, at least as they relate to node.js.


HTML5 logo

Ian Jacobs: W3C unveiled a logo for HTML5 today. HTML5 in the broad sense covers many different technologies at varying degrees of standardization and adoption. Commercial sites have begun to take advantage of some of the technology, and we are excited that this logo will help raise awareness about HTML5 and W3C.

Update: Ian Jacobs: The most unified criticism has centered around the FAQ's original statement that the logo means "a broad set of open web technologies", which some believe "muddies the waters" of the open web platform. Since the main logo was intended to represent HTML5, the cornerstone of modern Web applications, I have updated the FAQ to state this more clearly. I trust that the updated language better aligns with community expectations.


I’ve posted the rough beginnings of an implementation of xml5 for node.js.  The core of this work is the tokenizer, for which I wrote a simple script to do the conversion of Anne van Kesteren’s implementation of the parse state methods to the style that Aria Stewart used for html5.  Pretty much the remainder was “borrowed” from html5.

Plenty still needs to be done.


Chrome -= H.264

Mike Jazayeri: Though H.264 plays an important role in video, as our goal is to enable open innovation, support for the codec will be removed and our resources directed towards completely open codec technologies.