It’s just data

rFeedParser

Jeff Hodges: I was recruited after Bob Aman of FeedTools fame saw me hyping my translation of Mark Pilgrim’s FeedParser from Python to Ruby, and thought it was pretty good.  The translation, of course, is called rFeedParser and it really is pretty good.  I’ll have a post on that soon.  First, I want to fix the silly options bugs that I was turned on to a little while ago.

I’m not sure how I missed this before, but a

sudo apt-get install libxml-parser-ruby1.8
gem install rfeedparser

And I’m up and running.  Things worth exploring:


Hopefully Python’s Feedfind will be ported to Ruby in due course as well...

Posted by Damon at

Heh, don’t worry about not having seen it.  It was just dumb luck that Bob happened to be in the chat at the same time I was dropping information about it in #ruby on freenode.net.  I rarely mentioned it anywhere while I got it somewhere near stable.  Bob was only really paying attention because he and I had traded emails back when he was still developing FeedTools.

You might be interested to know that I just wrote a way-too-big post detailing some aspects of rfeedparser.  So, there’s that. 

The feeddate() idea is great and I’ll have py2rtime renamed to that in a later release (while keeping py2rtime as an alias for a while). 

The “\unn” and “\unnnn” formats for non-ASCII characters in Python (along with u'' and u"") pose special challenges to anyone trying to work with the tests, and careful thought would have to go into “fixing” them.  While the character-encodings gem provides a u'' sort-of-work-a-like, it certainly doesn’t interpret “\unn” or “\unnnn” as hex codes.
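To make the escape problem concrete, here is a minimal sketch of how a Ruby test harness might expand the four-digit “\unnnn” form itself.  The helper name expand_python_unicode_escapes is hypothetical and not part of rfeedparser or the character-encodings gem, and the sketch ignores other Python escape forms and surrogate pairs.

```ruby
# Hypothetical helper, not part of rfeedparser: expands Python-style
# "\uNNNN" escapes (backslash, "u", four hex digits) into UTF-8 text.
def expand_python_unicode_escapes(text)
  text.gsub(/\\u([0-9a-fA-F]{4})/) do
    # $1 holds the four hex digits; pack("U") encodes that code point as UTF-8.
    [$1.hex].pack("U")
  end
end

# Single quotes keep the backslash literal, the way the escape would
# arrive when read out of a Python test file as plain text.
escaped = 'caf\u00e9'
puts expand_python_unicode_escapes(escaped)
```

Text without any escapes passes through unchanged, so a helper like this could be applied to every assertion string before it is evaluated.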

There are other Python specific bits in the tests I didn’t mention in that large post, such as 1 and 0 being true and false, len(), None instead of nil, the use of tuples instead of lists, triple quoted strings, and the differences in syntax between dicts and Hashes.  Pretty much every Regexp in scrape_assertion_string in rfeedparsertest.rb is a possible “issue”.

The None/nil and len() problems can be solved by simply writing up a spec saying “we expect a reference called None that is acted on like ‘nil’ in Ruby and ‘None’ in Python, yadda yadda”.  The rest, though, will require more changes to the actual unit tests similar to the feeddate() idea.  All this assumes Mark is up for it, of course.
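As a rough illustration of what such a spec could look like on the Ruby side, the sketch below defines a None constant and a len() method so that Python-flavored assertions read the same in both languages.  Both names are assumptions made for this example; nothing here is taken from rfeedparser or its test runner.

```ruby
# Hypothetical shim for evaluating Python-flavored assertions from Ruby.
# None of these names come from rfeedparser itself.
None = nil # Python's None is treated like Ruby's nil

# Python's len(x) maps onto Ruby's x.length for strings, arrays, and hashes.
def len(collection)
  collection.length
end

# Illustrative data standing in for parsed feed entries.
entries = [{ "title" => "first" }, { "title" => "second" }]
puts len(entries) == 2
puts entries[0]["summary"] == None
```

This only covers the two easy cases named above; 1 and 0 as booleans, tuples, and dict syntax would still need changes in the tests themselves.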

I tried to get a hold of him a couple of months ago while trying to figure out exactly which license feedparser was under (or even if it had a name), but it looks like my email fell into a black hole.  As a result, I don’t have a good way of reaching him.  (Considering how often I see him here, it might be this very comment thread.)

Oh, and if you happen to know anyone who knows anything at all about iconv and how it expects encodings to be written, feel free to point them in my direction.  I’m seriously considering writing up a “standard” iconv-encodings package so that rfp can actually work consistently across OS X, Linux, etc. but this is deep dark magic to me.  I got as far as trying to follow the Ubuntu/Debian build of glibc to see where everything came together, with no luck.  It looks like I might be tilting at windmills.
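One possible shape for the iconv-encodings idea is a small alias table that folds the spellings found in feeds onto a single canonical name before anything is handed to iconv.  The sketch below assumes exactly that; the table entries and the canonical_encoding name are illustrative, and it does not answer the harder question of which names each platform's iconv actually accepts.

```ruby
# Hypothetical alias table: maps spellings seen in the wild onto one
# canonical name. The entries are illustrative, not an exhaustive registry.
ENCODING_ALIASES = {
  "utf8"      => "UTF-8",
  "latin1"    => "ISO-8859-1",
  "iso8859-1" => "ISO-8859-1",
  "ascii"     => "US-ASCII"
}

# Normalizes whitespace, case, and underscores, then looks the name up.
# Unknown names fall back to the upcased, hyphenated spelling.
def canonical_encoding(name)
  key = name.strip.downcase.tr("_", "-")
  ENCODING_ALIASES.fetch(key) { key.upcase }
end

puts canonical_encoding("utf8")
puts canonical_encoding(" Latin1 ")
puts canonical_encoding("iso_8859-1")
```

Keeping the table as plain data means it could be extended per platform without touching the lookup code.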

Posted by Jeff Hodges at

All this assumes Mark is up for it, of course.

I am a committer to feedparser.

iconv and how it expects encodings to be written

Try:

iconv --list

Posted by Sam Ruby at

Well, I’m kind of dumb for not knowing that you were a committer. And doubly dumb since I just looked at the Google Code site yesterday.  Yeesh.

“iconv --list” is about as useful as a yak hair on an iguana for this.  I’ve taken a look at the libiconv code and found it wanting (at least, for the current formulation of the iconv-encodings idea).

Posted by Jeff Hodges at

My bad.  I meant to send you a link, Sam, since I knew you’d be interested, but I was busy getting stuff sorted for my trip to Africa, and it slipped my mind somehow.  Samahani!

Incidentally, it’s not so much that I stopped working on FeedTools as it is that I started working on a different parser in C instead.  (Which is why I keep cheering anytime someone considers writing a tag soup or html5lib port for C.)  But basically everything code-wise is on hold until I’m back State-side.

Posted by Bob Aman at

is this anything?

[link]...

Excerpt from del.icio.us/minutillo at

New URI::Template release with generic test suite

URI::Template 0.08_02 should be hitting CPAN shortly. This release conforms to the latest uri-template spec released this month. I’ve always been interested in portable/generalized test suites. Sam Ruby mentions that he’d like to see...

Excerpt from LTjake's Journal at

Hi,

Nice post, and it works nicely for me on Ubuntu.  But my VPS runs CentOS.  Have you installed it on a CentOS box?  I don’t know what the CentOS equivalent of sudo apt-get install libxml-parser-ruby1.8 is; the package is not found.

Thanks for your help

Genis

Posted by Genis at
