xml5.js

2011-01-13T17:48:27Z

I’ve posted the rough beginnings of an implementation of xml5 for node.js. The core of this work is the tokenizer, for which I wrote a simple script that converts Anne van Kesteren’s implementation of the parse state methods to the style that Aria Stewart used for html5. Pretty much all of the remainder was “borrowed” from html5.

While this is not yet complete, you can see how it parses and dumps simple files with the following command:

node parse.js filename

Plenty still needs to be done. In particular:

When I run the above command, I get the following message, which I want to suppress:

###########################################################
#  WARNING: No HTML parser could be found.
#  Element.innerHTML setter support has been disabled
#  Element.innerHTML getter support will still function
#  Download: http://github.com/tautologistics/node-htmlparser
###########################################################

Accessing node names via DOM methods results in UPPERCASE values. I want to preserve case. For the moment, I do a case-insensitive match of end tags against start tags. Thinking about it, that may be worth spec’ing and retaining.
As Anne indicated, it would be ideal to accept the full range of HTML5 entity names.
Namespaces in particular and the test suite in general need to be addressed.
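
To make the case-handling idea above concrete, here is a minimal sketch of how start tag names could be stored exactly as written while end tags are matched against them case-insensitively. This is not the code in xml5.js: createOpenElements and its open, close, and current methods are hypothetical names invented for illustration, and closing everything above the matched element is an assumption of the sketch rather than settled behavior.

```javascript
// Hypothetical helper; none of these names come from xml5.js.
// Keeps open element names in their original case and matches
// end tags against them without regard to case.
function createOpenElements() {
  const stack = [];
  return {
    // Record a start tag name exactly as it appeared in the source.
    open(name) {
      stack.push(name);
    },
    // Close the nearest open element whose name matches, ignoring case.
    // Returns the names that were closed, innermost first, or [] if
    // no open element matched (assumption: such end tags are ignored).
    close(name) {
      const wanted = name.toLowerCase();
      for (let i = stack.length - 1; i >= 0; i--) {
        if (stack[i].toLowerCase() === wanted) {
          return stack.splice(i).reverse();
        }
      }
      return [];
    },
    // The innermost open element name, in its original case, or null.
    current() {
      return stack.length ? stack[stack.length - 1] : null;
    },
  };
}

const elements = createOpenElements();
elements.open('svg');
elements.open('linearGradient');
console.log(elements.close('LINEARGRADIENT')); // [ 'linearGradient' ]
console.log(elements.current());               // svg
```

The point of the design is that case is only folded at comparison time, so whatever is later handed to the DOM can keep the author’s original spelling.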