Widening the Net
The excerpting function seems to be working, and by now I would guess that everyone I could have encouraged to add RSS AutoDiscovery information to their website has done so.
So now it is time to widen the net a bit. No, I am not going to include the Ultra-liberal RSS locator, because I feel that it would be morally wrong to do so. Scouring several (possibly dozens of) sites for information after a human enters text into an entry field is one thing, but doing so automatically once an hour for each referrer is another.
So here is what I have implemented so far. If I retrieve a page and it has no appropriate link tag, then I will scan for <a> tags with hrefs that point to the same site and end with a file name that is commonly used for RSS. The ones I have come up with so far are: rss.xml, index.xml, index.rdf, and ?flav=rss. The first one I encounter will be used, so there will only be one attempt to fetch an RSS feed per site per hour.
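For the curious, here is roughly what that fallback scan could look like in Python. This is only a sketch of the idea, not the code actually running here: the names guess_feed_url and looks_like_feed are made up for illustration, "same site" is assumed to mean "same host name", and "ends with" is assumed to mean that the last path segment (or the whole query string, for ?flav=rss) matches exactly. It covers only the fallback, so the check for a <link> tag is assumed to have already come up empty.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# The conventions listed in the post.
FEED_FILENAMES = ("rss.xml", "index.xml", "index.rdf")
FEED_QUERY = "flav=rss"


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def looks_like_feed(url):
    """True if the URL ends with one of the common feed conventions."""
    parts = urlparse(url)
    if parts.query:
        return parts.query.lower() == FEED_QUERY
    filename = parts.path.rsplit("/", 1)[-1].lower()
    return filename in FEED_FILENAMES


def guess_feed_url(page_url, html):
    """Return the first same-site link that looks like a feed, or None."""
    collector = AnchorCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc.lower()
    for href in collector.hrefs:
        candidate = urljoin(page_url, href)  # resolve relative links
        if urlparse(candidate).netloc.lower() != site:
            continue  # points to a different site
        if looks_like_feed(candidate):
            return candidate  # first match wins: one fetch per site
    return None
```

Returning on the first match is what keeps this to a single fetch attempt per site; links to other hosts are skipped even when they end in a familiar file name.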
If you know of another common convention, leave a comment. If your site doesn't follow a common convention, consider adding a <link> tag to your site.