It’s just data

Disappearing Silverware

It looks like the summary attribute issue is finally well on its way to being closed.  Recent events: John’s position, Ian’s position, Maciej’s mediation, Ian’s acceptance, John’s withdrawal of his draft.

One of the key takeaways from all this is something Maciej said: “The most effective way, I believe, is to actively encourage a constructive alternative.”  In general, blocking the work of others is something I plan to discourage.  If that ends up making HTML5 more like Perl than Python, so be it.  After all, those who prefer rigid alternatives still have that option.

Previously I said that there might be as many as four drafts by the end of last month.  Here we are in August, and we are still holding at one draft.  I firmly believe that enabling forks is the best way to prevent such fragmentation.  The clearest explanation I have ever heard of why that premise applies so well to spec writing was made by Joe Gregorio: Camera-Ready Copy and the Social Denial-of-Service Attack.  My experience is that people tend to become reasonable (or at least throttle down the urge to be unreasonable) if you set the expectation that they need to make a concrete and constructive proposal.

Accordingly, I plan to continue enabling such potential forks in the hopes that doing so will end up facilitating the amicable resolution of issues.  During the course of this discussion, I was able to assist a number of people with the preparation of their text for publishing, even though none actually got published.  The remainder of this post is about that process.

You, too, can be a Publisher

The W3C uses cvs.  The WHATWG, subversion.  Anolis uses mercurial.  Manu, git.  I happen to have each installed already, so none of this is a problem for me, but I can see how it is intimidating.

I’ve looked at Manu’s build system, and despite the pretty pictures, I have to say that I don’t quite grok the approach.  I’m familiar enough with the tools to see how it all works mechanically: it is the overall approach that seems problematic to me.  The README tells you how to build Manu’s draft, but not how to create your own.  From what I can figure out, in order to create a new draft you need to create a configuration file listing all the sections you want (adding and removing as you see fit) and modify a makefile.  If you want to make a one-line change to a section, you need to snapshot that section.

What happens if Ian adds a section or modifies one of the sections you snapshotted?  Unless I’m missing something, you need to detect this yourself and merge.  I’m sure this can be automated above the level of the configuration management system, but isn’t that what the CM system itself is good for?

I’m sure that Manu will solve these issues, but for the moment, I’ve been taking a more direct route.

Recently, a few members of the PF Working Group sent me an updated draft based on Maciej’s suggestions.  All they did was take Ian’s latest source, edit it using the editor of their choice, and send it to me.  That editor made a number of changes: specifically, it added a BOM and a minimal html head section, wrapped the content in a body, closed the body and html tags at the end, and converted Unix-style LFs into DOS-style CRLFs.  Sounds like a lot, but the content in the middle was essentially intact: neither a quote nor a slash was added to sections that weren’t changed.

My workflow was simple.  I manage the source in git.  I manage the documents produced using the W3C’s cvs.  And to produce the output from the source, I use the Anolis tool created by Geoffrey Sneddon and a spec splitter script originally developed by Philip Taylor.

For the input, I created a branch named “summary_compromise”, and placed the file I received there, on top of the file named source.  The only thing I needed to be careful of was to first check out the same version of the source that they had used as their base, so that the internal deltas would be correct.  I then used dos2unix to get rid of the pesky CRs, and git add --patch to identify which “hunks” to commit (I omitted the front matter and back matter, and accepted the rest).  Once I was done, I committed the result.

I can even apply Ian’s subsequent changes via a git rebase master.  What happens if Ian adds a section?  Not a problem.  Or if Ian removes a section other than the one the PF group modified?  Not a problem.  What if Ian modifies different lines in the same section?  Again, not a problem.  What if Ian modifies or removes a line that the PF group modified or removed?  Then, and only then, would a merge be required, but even there, git has tools to assist with this.  Fetching, rebasing, or merging with a branch from Manu or anybody else would be equally straightforward.

As to the processing, what’s required is that you start with a standard W3C header (modifying the date), append the source, and pass the result to Anolis with a few options.  I was only able to find the options and header in Manu’s repository, so I went with those.  To make things easier for me, I created a small script and configuration file which enables me to do all this with one command: all I need to specify is the name of the branch.  The script even runs the spec splitter on the result to break the large file into smaller sections.  Committing the output to CVS would have been sufficient to trigger publishing on the W3C web site, but as I said, in this case I never had to get to this point.  In fact, I never published my local commits to any public git repository.
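A sketch of what such a one-command driver script might look like.  The header content, file names, and directory layout here are all assumptions (the real header came from Manu’s repository); the Anolis option list is the one given in a comment below, and the Anolis invocation is only echoed when the tool isn’t installed.

```shell
set -e
branch=${1:-summary_compromise}
out=$(mktemp -d)

# 1. Start with a standard W3C header, with the date filled in
#    (a stand-in header here, not the real W3C boilerplate)
printf '<!-- W3C boilerplate, %s -->\n' "$(date '+%d %B %Y')" > "$out/draft.html"

# 2. Append the spec source as of the requested branch
#    (in the real workflow: git show "$branch:source" >> "$out/draft.html")
echo '<h2>The summary attribute</h2>' >> "$out/draft.html"

# 3. Run Anolis, then the spec splitter, if available
anolis_args="--w3c-compat-xref-a-placement --parser=lxml.html \
--output-encoding=us-ascii --allow-duplicate-dfns"
if command -v anolis >/dev/null 2>&1; then
  anolis $anolis_args "$out/draft.html" "$out/Overview.html"
else
  echo "would run: anolis $anolis_args"
fi
```

From there, committing the split output to the W3C CVS repository is what triggers publication.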

While I am glad to help out, longer term I have zero interest in becoming indispensable in this way, or in establishing and retaining a position on the critical path for publishing.  I will help anybody who is capable of finding the command line and installing Anolis on their machine through the rest.

For the record, if you want to skip installing Anolis locally you can use it as a web service. If someone tells me which options to set I can make the default mode of that service “Do what Hixie does”.

Posted by jgraham at

The options I am currently using are from Manu’s repository, namely:

--w3c-compat-xref-a-placement --parser=lxml.html --output-encoding=us-ascii --allow-duplicate-dfns
Posted by Sam Ruby at

Since your comment system enforces OpenID, but then doesn’t like my implementation, and crashes, I’ll try to remember my comment:

You misrepresent the situation. This item is not nearing closure. It’s that no one wants to have a battle over a Working Draft.

All that’s happened now is that everything is going to hit the fan at Last Call time. You’ve not solved problems, you’ve only pushed them out.

As for the source code control thing: technology does not solve problems when it comes to people working through differences. It just raises barriers, and filters out voices. I would think after all these years, you would understand that by now.

Posted by Shelley at

I’m familiar with using VCS tools, but the practical steps involved in “forking” the spec are still confusing to me, even after following the various mailing list threads and reading this entry. I’m not sure whether this has to do with the process being too complicated or me being as dumb as a bag of hammers (probably the latter).

Here are the things that are confusing:

1) Is there a “preferred” source control tool for this task? W3C uses cvs, WHATWG uses Subversion, Sam and Manu use git. If I’m coming to this as a tabula rasa, which tools should I be using?

2) Where is the canonical repository from which I should be checking things out? The W3C repo, the WHATWG repo, somewhere else?

The next step seems clear — based on this blog entry, it sounds like once the files are checked out I make my edits in the file named “source”, then use Anolis to turn that source into formatted output (right?). But then...

3) Once I’ve got my formatted output, how do I submit it for consideration?  Do I check it in at the W3C CVS repository (as a branch, or in some other way)? Or do I host it myself, a la Manu?  If so, what’s the way to put it before the WG?

My Invited Expert membership in the WG has lapsed, so while explanations of these points don’t do much good for me, I submit them in hopes they might help others...

Posted by Jason Lefkowitz at

The W3C CVS repository contains published documents.

The WHATWG SVN repository contains the source.  I have a git clone (essentially an identical copy) of the WHATWG SVN repository, which is resynced hourly.  Manu’s repository doesn’t contain the WHATWG sources; instead it contains deltas from that source, and a build process that allows you to fetch (and split, and reassemble) the WHATWG source, adding, removing, or replacing micro-sections at will.

If you rejoin the HTML WG and provide an SSH2 key, you can get access to the W3C CVS repository.  Or, if you prefer, I or somebody else can provide assistance.

Ultimately, you could simply start with copying a finished document, but maintenance may be an issue if you take such an approach.

Posted by Sam Ruby at

OK, that’s helpful — thanks!  It sounds like the answers are:

1) git

2) WHATWG svn, which those wanting to edit should clone via git to have their own working repo in which edits are made

3) W3C CVS, which WG members have access to

Now I will go see what’s involved in renewing my membership...

Posted by Jason Lefkowitz at

For 2, you might find it easier (as in quicker: my first time took 80 minutes to process the conversion) to clone my git copy as it has the same data, and is already in git format.

I look forward to you rejoining....

Posted by Sam Ruby at

Discovered another spec-template, this one by Cameron McCormack.

Posted by Sam Ruby at

Another piece to the puzzle.

Posted by Sam Ruby at

Now on github

Posted by Sam Ruby at
