intertwingly

It’s just data

Parsing Atom with Erlang


A simple program for parsing memes.atom.  Below is an annotated version.

-module(memes).
-export([scan/0]).
-include_lib("xmerl/include/xmerl.hrl").

Define a module named memes that exports a single function named scan which takes zero parameters.  Include the headers for xmerl, a library for processing XML.

memes_url() ->
  "http://planet.intertwingly.net/memes.atom".

Define a simple function that returns a constant string.  Some people prefer to use macros for things like this.
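For comparison, here is a minimal sketch of the macro approach mentioned above. The module name memes_macro and the macro name MEMES_URL are made up for illustration; the wrapper function exists only so the value can be checked from the shell.

```erlang
-module(memes_macro).
-export([memes_url/0]).

%% Hypothetical macro replacing the zero-argument function: the
%% preprocessor substitutes the string wherever ?MEMES_URL appears.
-define(MEMES_URL, "http://planet.intertwingly.net/memes.atom").

%% Wrapper so the macro's value can be inspected from outside the module.
memes_url() ->
  ?MEMES_URL.
```

A macro is expanded at compile time and is only visible inside the module (or in files that include its definition), whereas a function can be exported and called from elsewhere.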

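scan() ->
  %% inets provides the HTTP client
  application:start(inets),
  %% httpc replaces the older http module, which current OTP releases no longer ship
  { ok, {_Status, _Headers, Body }} = httpc:request(memes_url()),
  %% parse the response body, then format every entry element in the feed
  { Xml, _Rest } = xmerl_scan:string(Body),
  format_entries(xmerl_xpath:string("//entry",Xml)),
  init:stop().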

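The main program: start inets, fetch the feed over HTTP, parse the response body with xmerl, pass every entry element matched by the XPath expression //entry to format_entries, then shut down the runtime.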
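The parse and select steps can be tried without a network connection. The following is a sketch only: the module name scan_sketch, the function names, and the two-entry feed are invented for illustration, and the sample omits the namespace declaration and most of the elements a real Atom feed carries.

```erlang
-module(scan_sketch).
-export([titles/0]).
-include_lib("xmerl/include/xmerl.hrl").

%% A made-up feed standing in for the downloaded Body.
sample_feed() ->
  "<feed>"
  "<entry><title>One</title><link href=\"http://example.com/1\"/></entry>"
  "<entry><title>Two</title><link href=\"http://example.com/2\"/></entry>"
  "</feed>".

%% Parse the string, select every entry, then evaluate a relative
%% XPath expression against each entry to pull out its title text.
titles() ->
  {Xml, _Rest} = xmerl_scan:string(sample_feed()),
  [Title || Entry <- xmerl_xpath:string("//entry", Xml),
            #xmlText{value=Title} <- xmerl_xpath:string("title/text()", Entry)].
```

Calling scan_sketch:titles() from the shell should give the two titles in document order. The second argument to xmerl_xpath:string is the context node, which is why the relative path title/text() only sees the titles of the entry it is given.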

format_entries([]) -> done;
format_entries([Node|Rest]) ->
  [ #xmlText{value=Title} ] = xmerl_xpath:string("title/text()", Node),
  [ #xmlAttribute{value=Link} ] = xmerl_xpath:string("link/@href", Node),
  Message = xmerl:export_simple_content([{a,[{href,Link}],[Title]}],xmerl_xml),
  io:format('~s~n', [xmerl_ucs:to_utf8(Message)]),
  format_entries(Rest).

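In lieu of looping constructs, Erlang programs tend to use recursion and pattern matching: the first clause matches the empty list and ends the recursion, while the second binds the first entry to Node, prints it, and calls itself on Rest.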
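The same two-clause shape, stripped of the XML handling, might look like the sketch below. The module name recursion_sketch and the function lengths are hypothetical; they are not part of the program above.

```erlang
-module(recursion_sketch).
-export([lengths/1]).

%% Base case: the empty list matches this clause and ends the recursion.
lengths([]) -> [];
%% Recursive case: [Word|Rest] binds the head and the tail of the list,
%% handles the head, and recurses on the tail.
lengths([Word|Rest]) ->
  [length(Word) | lengths(Rest)].
```

For example, recursion_sketch:lengths(["atom", "erlang"]) returns [4,6]. Which clause runs is decided purely by the shape of the argument, so no explicit test for the end of the list is needed.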

Clearly dumping XHTML fragments to stdout isn’t ideal (perhaps XHTML-IM instead?), and you wouldn’t want to dump every meme on every run, but those are problems for another day.