intertwingly

It’s just data

Planetary Exploration


The original Planet was simply named Planet (not Planet Planet, despite what the web site says).  It was originally created by Scott James Remnant and Jeff Waugh.  My original interest was in ensuring that it had proper Atom support.  Over time, I became the de facto primary maintainer.

Planet’s strength was that it built on Mark Pilgrim’s feedparser which converted any feed into a lazy dictionary, and Tomas Styblo’s templating engine which converted a static file and a dictionary into HTML (or whatever format you like).  So if you do the math, feed plus static file (plus some configuration) gives you a site.  State information was managed using a dbhash and a cache directory.
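To make the "feed plus static file gives you a site" arithmetic concrete, here is a minimal sketch.  It does not use feedparser or Planet's actual templating engine; the standard library's string.Template stands in for the template step, and the feed dictionary, the PAGE and ITEM templates, and the render function are all made up for illustration.

```python
from html import escape
from string import Template

# Hypothetical stand-in for a parsed feed: just a plain dictionary.
feed = {
    "title": "Example Planet",
    "entries": [
        {"title": "First post", "link": "http://example.com/1"},
        {"title": "Second post", "link": "http://example.com/2"},
    ],
}

# Hypothetical static templates (these would normally live in a file).
PAGE = Template("<h1>$title</h1>\n<ul>\n$items\n</ul>")
ITEM = Template('<li><a href="$link">$title</a></li>')


def render(feed):
    """Combine the feed dictionary with the static templates to produce HTML."""
    items = "\n".join(
        ITEM.substitute(
            link=escape(entry["link"], quote=True),
            title=escape(entry["title"]),
        )
        for entry in feed["entries"]
    )
    return PAGE.substitute(title=escape(feed["title"]), items=items)


html = render(feed)
print(html)
```

The point is only the shape of the pipeline: once the feed is a dictionary, producing a page is a matter of substitution.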

Although the codebase was small, it became difficult to maintain (pretty much all of the mainline logic was in the __init__ file), and I had dreams of introducing filters.  So, I embarked on a radical refactoring.  In the process, I eliminated the need for the dbhash.

With this new design, I was able to create a number of filters, but the way content was handled was always an issue.  All content was simply serialized as a string.  This meant that sanitization required parsing the content into tokens, removing or modifying nodes, and re-serializing the result.  Expanding relative URIs involved the same process.  Extracting microdata involved the same process.  And so on.  Each pass introduced the possibility of mangling such things as MathML.
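A rough sketch of that pattern follows.  This is not the actual filter code; it assumes well-formed XML content so that the standard library's ElementTree can stand in for the parser, and the sanitize and expand_uris functions and the parse counter are hypothetical names used only to show that every filter pays for its own parse and serialize.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

parse_count = 0  # counts how many times the same content gets parsed


def parse(content):
    global parse_count
    parse_count += 1
    return ET.fromstring(content)


def serialize(root):
    return ET.tostring(root, encoding="unicode")


def sanitize(content):
    """String in, string out: parse, drop script elements, re-serialize."""
    root = parse(content)
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "script":
                parent.remove(child)
    return serialize(root)


def expand_uris(content, base):
    """String in, string out: parse, resolve relative hrefs, re-serialize."""
    root = parse(content)
    for anchor in root.iter("a"):
        href = anchor.get("href")
        if href is not None:
            anchor.set("href", urljoin(base, href))
    return serialize(root)


content = '<div><script>alert(1)</script><a href="/about">About</a></div>'
result = expand_uris(sanitize(content), "http://example.com/blog/")
print(result)
print("parses:", parse_count)
```

Two filters, two parses, two serializations, and each round trip is another chance to lose something the parser or serializer does not fully understand.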

Mars took a different direction.  Instead of a dictionary, there was a DOM.  Feed elements were placed into the DOM.  Content elements were in the DOM.  You can iterate over everything.  Everything is only parsed once.  Everything is only serialized once.  A much cleaner design... if you like XSLT templates.  If you like more traditional templates, like Haml, all this ended up doing was moving the problem of converting the DOM into a dictionary/hash to another place.
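The parse-once, serialize-once idea can be sketched like this.  It is not Mars's code; ElementTree again stands in for the DOM, the feed document is a simplified, namespace-free example, and the filter names are hypothetical.  The difference from string-based filters is that each filter takes and mutates the shared tree.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Simplified, namespace-free stand-in for a feed with embedded content.
SOURCE = """<feed>
  <entry>
    <title>First post</title>
    <content><div><script>alert(1)</script><a href="/about">About</a></div></content>
  </entry>
</feed>"""


def sanitize(root):
    """Mutate the tree in place: drop script elements."""
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "script":
                parent.remove(child)


def expand_uris(root, base):
    """Mutate the tree in place: resolve relative hrefs against base."""
    for anchor in root.iter("a"):
        href = anchor.get("href")
        if href is not None:
            anchor.set("href", urljoin(base, href))


root = ET.fromstring(SOURCE)  # parsed once
sanitize(root)
expand_uris(root, "http://example.com/blog/")

# Feed-level elements and content live in the same tree, so both can be walked.
titles = [entry.findtext("title") for entry in root.iter("entry")]

output = ET.tostring(root, encoding="unicode")  # serialized once
print(titles)
print(output)
```

Adding another filter costs another walk over the tree, but no additional parse or serialization.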

Now that I’m exploring node.js, I have the opportunity to revisit this once again.  Since I have access to jQuery, I should be able to eliminate the pesky problem of converting a DOM into a format usable by a templating engine.  I should be able to pass in a single value, named $, which contains a set of entries to be iterated over.

As this is a journey, I’m not sure where this will end up, or if it will end up with anything useful at all.  Perhaps it will end up with a more scalable and dynamic server that ties into PubSubHubbub.  Perhaps the planet software itself will move from the server to the client and take advantage of Web Workers and local storage.