intertwingly

It’s just data

FeedDiff for Roller


Yesterday, I had lunch with Dave Johnson.  He asked me how hard it would be to add support for the RFC 3229 "feed" instance manipulation method to Roller.  I said that I would look into it.

As an aside, I find designing caching logic to be some of the most interesting work.  To properly design a cache, one must understand the overall solution and the desired policies, be able to make system-level trade-offs among performance, bandwidth, and storage, and be prepared to violate pretty much every software engineering principle related to encapsulation there is.  The existing Roller pagecache implementation is no exception.  It currently has to worry about things like language and security, and it even knows which changes affect feeds.

My suggestion is to initially not change any of this, but to define a new cache (leveraging the existing LRUcache implementation) specifically for current and prior versions of feeds.  A rough sketch of a design for a filter follows:

ETag = hash(feed)

setHeader("ETag", ETag)
 
if not mFeedCache.contains(ETag):
  mFeedCache.put(ETag, feed)
 
if not getHeader("If-None-Match"):
  return full feed
 
if parse(getHeader("If-None-Match")).contains(ETag):
  return status 304 (Not Modified)
 
if getHeader("A-IM"):
  if parse(getHeader("A-IM")).contains("feed"):
    for (String tag: parse(getHeader("If-None-Match"))):
      if mFeedCache.contains(tag):
        setStatus(226)
        setHeader("Vary", "If-None-Match")
        setHeader("IM", "feed")
        getWriter().write(diff(feed,mFeedCache.get(tag),"entry"))
        return
 
return full feed
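
The feed cache that the filter sketch relies on could look something like the following. This is only a minimal Python illustration of an LRU cache keyed by ETag; the class name FeedCache, its capacity parameter, and its method names are assumptions made for this example and are not taken from Roller's existing LRUcache.

```python
from collections import OrderedDict


class FeedCache:
    """Hypothetical LRU cache mapping an ETag to the feed text it identifies."""

    def __init__(self, capacity=8):
        # capacity is an assumed tuning knob: how many feed versions to keep
        self.capacity = capacity
        self._entries = OrderedDict()

    def contains(self, etag):
        return etag in self._entries

    def put(self, etag, feed):
        # Store the feed and mark it as the most recently used entry
        self._entries[etag] = feed
        self._entries.move_to_end(etag)
        # Evict least recently used versions once over capacity
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)

    def get(self, etag):
        # Return the cached feed (or None) and refresh its position
        feed = self._entries.get(etag)
        if feed is not None:
            self._entries.move_to_end(etag)
        return feed
```

Keeping prior versions around is what makes a 226 response possible: a client presenting an old ETag can only be sent a delta if the feed that ETag named is still in the cache.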

FeedDiff contains suggested implementations for diff, hash, and parse.  Note: none of this logic is particularly Atom specific.
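
For readers who want something concrete to experiment with, here is a rough Python sketch of what hash, parse, and diff might look like. These are stand-ins written for this post's pseudocode, not the FeedDiff implementations; the names feed_hash and parse_list, the choice of SHA-1, and the assumption that entries are direct children of the root element (true for Atom, not for RSS 2.0 items under channel) are all mine.

```python
import hashlib
import xml.etree.ElementTree as ET


def feed_hash(feed):
    """Return a quoted ETag derived from the feed text (SHA-1 is an assumed choice)."""
    return '"' + hashlib.sha1(feed.encode("utf-8")).hexdigest() + '"'


def parse_list(header):
    """Split a comma-separated header value into stripped, non-empty tokens.

    Simplification: ignores q-values and weak (W/) validators.
    """
    return [t.strip() for t in (header or "").split(",") if t.strip()]


def _local(tag):
    # Drop any "{namespace}" prefix that ElementTree puts on tag names
    return tag.rsplit("}", 1)[-1]


def diff(new_feed, old_feed, name="entry"):
    """Return new_feed minus the `name` elements that appear unchanged in old_feed."""
    old_root = ET.fromstring(old_feed)
    new_root = ET.fromstring(new_feed)
    # Serialized form of every entry the client already has
    seen = {ET.tostring(e).strip() for e in old_root if _local(e.tag) == name}
    for e in list(new_root):
        if _local(e.tag) == name and ET.tostring(e).strip() in seen:
            new_root.remove(e)
    return ET.tostring(new_root, encoding="unicode")
```

Because entries are compared by their serialized text, an entry that was edited since the old version counts as new and is sent again, which seems like the right behavior for a feed reader.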

Tim Bray has said that he would write a bloody Internet Draft himself for the Atom WG if nobody else does.  Perhaps somebody in the rss-dev working group or on the RSS Advisory Board would consider doing likewise for their feed formats?