intertwingly

It’s just data

Venus Rising


Today I’m making available Planet Venus, which can only be described as a radical refactoring of the Planet code base.

The reasons for a radical refactoring are several.  My primary reason is that I don’t enjoy working on codebases that lack an automated regression test suite.  Furthermore, codebases with such a test suite tend, in my experience, to be more modular, so it is generally difficult to bolt one on afterwards.  On this point, I would be glad to be proven wrong.

A second reason is that a number of people have identified memory consumption as a performance issue with the existing Planet.  The current design is to read all the content and metadata associated with every post for every feed that you are subscribed to into memory, update it, write it, and then make multiple passes through this data.  While CPU utilization issues can be mitigated with tools like nice, memory issues are a bit harder to address.

A final reason is that there has been an as-yet unmet demand for customization.  Conceptually, all of the use cases for GreaseMonkey apply equally to feeds; in particular, the canonical one of wanting to use the Coral content network selectively applies here too.  This is difficult for feeds, not only because of the variety of feed formats out there and because of invalid feeds, but also because some elements may contain plain text, escaped HTML, or embedded XHTML.  Having all markup pre-sanitized and converted to well-formed XHTML, with all relative URI references pre-resolved, makes the job of producing a plugin script much easier.
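To give a feel for why that helps, here is a minimal sketch of what such a plugin script might look like. It is not part of Venus: the function names and the idea of passing in a set of hosts are hypothetical, and it assumes that a URL is routed through Coral by appending ".nyud.net" to its hostname. Because the input is assumed to be well-formed XHTML with absolute URIs, a stock XML parser is all it needs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit, urlunsplit

XHTML_NS = "http://www.w3.org/1999/xhtml"
CORAL_SUFFIX = ".nyud.net"  # assumption: how Coral hostnames are formed


def coralize(url, hosts):
    """Return url routed through Coral if its host is in hosts, else unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "http" or parts.hostname not in hosts:
        return url
    netloc = parts.hostname + CORAL_SUFFIX
    if parts.port:
        netloc += ":%d" % parts.port
    return urlunsplit(parts._replace(netloc=netloc))


def rewrite_images(xhtml, hosts):
    """Rewrite img src attributes in a well-formed XHTML fragment.

    Relies on the markup already being sanitized XHTML with
    absolute URIs, so no tag-soup handling or URI resolution is needed.
    """
    ET.register_namespace("", XHTML_NS)  # keep the default namespace on output
    root = ET.fromstring(xhtml)
    for img in root.iter("{%s}img" % XHTML_NS):
        src = img.get("src")
        if src:
            img.set("src", coralize(src, hosts))
    return ET.tostring(root, encoding="unicode")
```

Only images on the listed hosts are touched, which is the "selectively" part; everything else passes through unchanged.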

This is a work in progress, and not really even ready for experimental use just yet.  I’ve been working on it slowly over a period of time, and this week I happened to have extended periods without network access, and this was something I could play with offline.

If you have an existing planet and want to try this out, take your config.ini, change your cache_directory to point to an empty directory, and run the following commands:

python spider.py config.ini
python splice.py config.ini > examples/index.html
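If you would rather not edit your live config.ini by hand, the change can be scripted. The following is only a sketch: the helper name is hypothetical, and it assumes that cache_directory lives in a [Planet] section, as it does in the sample Planet configurations I have seen. Note that rewriting the file through configparser drops any comments, so it writes to a copy.

```python
import configparser
import os


def point_cache_at_empty_dir(src_path, dst_path, cache_dir):
    """Copy a Planet config, pointing cache_directory at an empty directory."""
    # interpolation=None: Planet configs may contain literal '%' (date formats)
    config = configparser.ConfigParser(interpolation=None)
    with open(src_path) as f:
        config.read_file(f)

    os.makedirs(cache_dir, exist_ok=True)
    if os.listdir(cache_dir):
        raise ValueError("cache directory is not empty: %s" % cache_dir)

    # assumption: the setting lives in the [Planet] section
    config["Planet"]["cache_directory"] = cache_dir
    with open(dst_path, "w") as f:
        config.write(f)
```

The resulting copy can then be passed to spider.py and splice.py in place of config.ini.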

While I don’t yet have template support (patches welcome), I do have a sample XSLT file that will produce something recognizable.