intertwingly

It’s just data

MicroLark


John Cowan: I’ve been developing a parser for MicroXML which I have dubbed MicroLark, in honor of Tim Bray's original 1998 XML parser Lark. I didn’t take any code from Lark, but we ended up converging on similar ideas: it provides both push and tree parsers (as well as a pull parser), it is written in Java, and I intend to evolve it as MicroXML evolves.

I’ll openly admit at this point that I’m skeptical about the prospects of MicroXML.  It doesn’t contain enough of XML to correctly parse feeds (be they RSS 1.0, RSS 2.0, or Atom).  It doesn’t contain enough of XML to correctly parse the full range of content of XHTML5 (when you include MathML and/or SVG).  It doesn’t contain enough of XML to correctly parse the full range of SVG content.  In short, I don’t know what the use cases are for MicroXML.

And that only covers well-formed content.  In the HTML and feed worlds, ill-formed content is rampant.  So if MicroXML is in any way a reaction to HTML5, in my opinion it picks the wrong lessons to learn.

I continue to be more hopeful about XML5.  That being said, my own personal efforts have stalled for the moment, at least as they relate to node.js.  jsdom is buggy, and I’ve yet to get a response to my email, my tweet, or even my post to the mailing list.  I’d fork the project myself, but if you read to the bottom of my post, you’ll see that I can’t even seem to figure out how to run the tests.

For the moment at least, I fail.