intertwingly

It’s just data

First Impressions: node.js


Edward O’Connor: Fortunately, Node already has an excellent implementation of the HTML5 parser (by Aria Stewart)

I find it rather amusing that the first thing I encountered was a bug. This bug was quickly addressed, and I’ve verified the fix.

Actually, that was the second problem. The first was that npm wouldn’t install when I installed node.js from git. The symptoms were that npm would download, install to a temporary directory, attempt to install for real, remove the temporary directory, and then report success. To see what was happening, I downloaded the install script, removed the code that deleted the temporary directory, ran it again, went into that temporary directory, and ran make manually. That produced a failure (simply a return code of 1 with no other information), which the script apparently didn’t treat as a failed installation.
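The failure mode described above (a build step exits with code 1, yet the installer reports success) comes down to an exit status that is never checked. For illustration, here is a minimal sketch of a wrapper that refuses to continue when a step fails. It is not how npm's install script works; `runStep` is a hypothetical helper name, and it relies on `child_process.spawnSync`, which exists in current Node.js releases but not in v0.2.6.

```javascript
const { spawnSync } = require('child_process');

// Hypothetical helper: run one build step and throw if it fails,
// so a non-zero exit code can never be reported as success.
function runStep(command, args, options = {}) {
  const result = spawnSync(command, args, { encoding: 'utf8', ...options });
  if (result.error) {
    // The command could not be started at all (e.g. not found).
    throw result.error;
  }
  if (result.status !== 0) {
    throw new Error(
      `${command} ${args.join(' ')} exited with status ${result.status}`
    );
  }
  return result.stdout;
}
```

A caller would run something like `runStep('make', [], { cwd: tmpDir })` (with `tmpDir` standing in for the temporary directory) and only print a success message if no exception was thrown.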

Here are the installation instructions that actually worked for me (backing up to the stable version):

sudo apt-get install g++ curl libssl-dev apache2-utils
wget http://nodejs.org/dist/node-v0.2.6.tar.gz
tar xzf node-v0.2.6.tar.gz
cd node-v0.2.6
./configure --prefix=$HOME
make
make install
curl http://npmjs.org/install.sh | sh

With that fix in place, I was able to proceed to run the test I wanted:

// Create a jsdom window that uses html5 as its parser
var http  = require('http'),
    html5 = require('html5'),
    jsdom = require('jsdom'),
   window = jsdom.jsdom().createWindow(null, null, {parser: html5});

// Request the blog's front page
var rubix = http.createClient(80, 'intertwingly.net');
var request = rubix.request('GET', '/blog/', {'host': 'intertwingly.net'});
request.end();
request.on('response', function (response) {
  // Feed the HTTP response to the html5 parser, which fills in window.document
  var parser = new html5.Parser({document: window.document});
  parser.parse(response);
  // Load jQuery into the window, then print the text of every h3
  jsdom.jQueryify(window, 'jquery-1.4.4.min.js', function(window, $) {
    $('h3').each(function() {
      console.log($(this).text());
    });
  });
});
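The script above uses `http.createClient`, which was deprecated and later removed from Node. As a rough sketch of only the fetch step on a current Node.js release, the same request can be made with `http.get`. The `fetchBody` name is hypothetical, and the html5/jsdom parsing step is deliberately left out, since those libraries' APIs have also changed since this was written.

```javascript
const http = require('http');

// Hypothetical helper: fetch a URL over plain HTTP and resolve
// with the complete response body as a string.
function fetchBody(url) {
  return new Promise((resolve, reject) => {
    http.get(url, (response) => {
      response.setEncoding('utf8');
      let body = '';
      response.on('data', (chunk) => { body += chunk; });
      response.on('end', () => resolve(body));
      response.on('error', reject);
    }).on('error', reject);
  });
}
```

Calling `fetchBody('http://intertwingly.net/blog/')` returns a promise for the page's HTML, which could then be handed to whichever parser is in use.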

Observations:

Next time I pick this up I’ll have to try something larger.