It’s just data

Sausages and Uncertainty

Scales of Justice. Based on work Copyright 2007 by Ken A L Coar. All rights reserved. The design and this SVG rendition are protected by copyright law, and may not be used or reproduced without the express permission of the author, coar@apache.org.

I’ve often found lawyers frustrating.  No matter how carefully you craft a question to only permit answers of yes or no, they always seem to find a way to pick door number 3.

Given that, I should have known better in July when I volunteered to take over a vacancy as Chair of the ASF Legal Affairs Committee when Cliff Schmidt decided to devote more of his time to Literacy Bridge.  And I certainly should have known better than to volunteer to take an unfinished third party licensing policy to completion.

Fast forward to yesterday.  We had an ASF members meeting.  You can see the board results here.  New members were elected too — those names will dribble out as they are informed and (hopefully) accept.

At that meeting, the tables were turned.  Instead of it being me crying for a simple yes or no answer, a number of members, led by Stefano and Ben, came after me complete with torches and pitchforks.  OK, so I’m exaggerating slightly.  There were no torches.  And only really tiny pitchforks.  Actually they weren’t pitchforks at all; more like Monty Python-esque taunting.  Oh, and it was not directed at me, exactly.  Just at the lack of closure.  On what clearly must be a series of simple yes and no questions.  I mean really.  For example, is the Creative Commons Attribution license version 2.5 compatible with the Apache License version 2.0?  Surely that is a yes or no question, right?  Actually, no.  But we can quickly come up with a set of guidelines that everybody can live with.  And, after all is said and done, isn’t that what everybody really needs?

But I digress.  Where was I?  Oh, yes, the meeting.  Luckily I had prepared in advance.

My plan from here on out is to push for Category X licenses as well as the transition examples to be added to the resolved legal questions.  And to state that the work on best practices and specific limited exemptions for all other licenses (effectively all the licenses known to be in category B, and all licenses yet to be explored) is ongoing.  And with that Jedi-like hand wave coupled with the Apache secret weapon, namely an open invitation for all those who are affected by this to join legal-discuss and help work out the issues (also known as the where’s your patch? or thanks for volunteering defense), the villages will once again be peaceful.

Wish me luck.  Oh, and don’t tell anybody about my secret plan.  Nobody reads my blog anyway.

And if any of you out there are lawyers: I’m sorry for the trouble I’ve given you in the past.


Luck.  You’ll need it.

Explaining IP to software developers is as fun as explaining unit testing to lawyers.  But there’s an interesting overlap between these two worlds.  Developers base their knowledge on judging how the end result stands up to the test of time: bugs, maintenance, performance, scalability, migration, etc.  Lawyers, by how well it stands the test of court.  So when a simple yes/no fails, maybe what’s needed is a better explanation of what the legal test units look like.

Posted by Assaf Arkin at

FWIW, Creative Commons recommends NOT using any CC license for software.  [link]

Which of course does not help answer the question of whether any particular CC license is compatible with any particular software license, but hopefully it makes the question occur less frequently.

Posted by Mike Linksvayer at

Mike: Since there are few cases where CC licensing is directly applied to what is indisputably software (and because such projects can usually be relicensed ASAP, i.e. before derivative works are created by third parties), not much, actually.

It’s the edge cases like CSS stylesheets, markup templates, or icons incorporated into the UI where CC licenses are applied to what might be considered software under certain circumstances, or where CC content can be embedded or incorporated into, or aggregated with, software, that the problems arise.

Posted by Michael R. Bernstein at

[from mlinksva] Sam Ruby: Sausages and Uncertainty

[link]...

Excerpt from del.icio.us/network/gojomo at

I read your blog.

I didn’t just volunteer for something by writing that, did I?

Posted by Shelley at

Hi, I read your blog, but from Planet Intertwingly.

The scale image shows up as:

Scales of Justice. Based on work Copyright 2007 by Ken A L Coar. All rights reserved. The design and this SVG rendition are protected by copyright law, and may not be used or reproduced without the express permission of the author, coar@apache.org.

I see that it’s from someone in ASF, so they probably have allowed you to use it. Just a nice background for the lawyerly post.

Posted by hdh at

Assaf, calling it IP is a great way of confusing things right from the start! :)

cf. Did You Say “Intellectual Property”? It’s a Seductive Mirage

Posted by Noah Slater at

Anarcho-Syndicalist Commune

Sam is doing the hard job, striving to find a workable framework, a firm vocabulary, but grows suspicious that the watery tart will have the last laugh....

Excerpt from Ascription is an Anathema to any Enthusiasm at

Interesting - in Google Reader, this post showed the contents of Ken Coar’s “description” element (which I’m sure was meant to be a “desc” element).

Posted by Jeff Schiller at

That’s why I inserted <![CDATA[...]]>, in the hopes that consumers which were following HTML’s rules would see that as a comment.  But, of course, some readers (including, apparently, Google’s) aren’t as careful as others with respect to following specifications.

I’ve gone ahead and changed <description> => <desc>, but I don’t have high hopes that it will make a difference.

Posted by Sam Ruby at

That’s why I inserted <![CDATA[...]]>, in the hopes that consumers which were following HTML’s rules would see that as a comment. But, of course, some readers (including, apparently, Google’s) aren’t as careful as others with respect to following specifications.

I’m curious: where exactly in the HTML spec is a CDATA section defined as a comment? I know that’s the convention, but is it actually documented in a real spec somewhere? Also, isn’t your feed using xhtml for the content type? I’m pretty sure that in XHTML, a CDATA section functions as it would in any other XML document - it’s just an escaping mechanism.

Which particular specifications did you think Google, and others, were not following?

Posted by James Holderness at

I’m curious: where exactly in the HTML spec is a CDATA section defined as a comment? I know that’s the convention, but is it actually documented in a real spec somewhere?

Do you consider HTML5 a real spec?  It is merely a working draft at this point, but it purports to describe HTML as actually practiced.  The documentation of tag open describes how CDATA is handled by HTML parsers (like IE, Gecko, Safari, ...).

Also, isn’t your feed using xhtml for the content type? I’m pretty sure that in XHTML, a CDATA section functions as it would in any other XML document - it’s just an escaping mechanism.

Yes.  My content renders correctly when served as application/xhtml+xml.  It also would render acceptably if served as is as text/html.  But Google Reader neither serves this content as application/xhtml+xml nor as is.

Posted by Sam Ruby at

Do you consider HTML5 a real spec?

I don’t pretend to know the ins and outs of W3C policy, but I would have thought that until HTML5 is ratified or whatever the term, it’s not really an official spec.

Now I agree that HTML parsers should be treating CDATA sections as comments, given that’s the de facto standard in use today. However, it seems overly harsh to accuse someone of not following specifications in that regard, when they’re actually doing exactly what the current spec says they should.

That said, I don’t think that’s what Google Reader is doing. I believe they’re interpreting your xhtml content as xhtml, as they should - they just don’t support svg. If you view this page in Firefox with svg disabled, it appears pretty much identically to how the feed entry appears in Google Reader. Are you saying that’s incorrect?

I would have thought it perfectly acceptable for an xhtml parser to interpret xhtml as xhtml even if it didn’t understand every namespace included in the page. Am I wrong in that assumption?

Posted by James Holderness at

I believe they’re interpreting your xhtml content as xhtml, as they should - they just don’t support svg.

I don’t believe that they are interpreting my XHTML content as XHTML, though I suspect that somewhere there is a semantic gap and you and I are using the same terms to mean different things.

I believe that they are parsing my content as XML, stripping out tags that aren’t white-listed, and serializing the result in a way that does not preserve the original CDATA designations, and then serving the resulting serialization as if it were HTML4 strict.

I believe that if they served the result as XHTML, preserving the CDATA delimiters is not important.  Alternately, if they were to preserve the CDATA delimiters and serve the result as text/html, we likely would not be having this discussion.

Posted by Sam Ruby at
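
A minimal sketch of the CDATA point under discussion, assuming a generic XML toolchain rather than whatever Google Reader actually runs. It uses Python's standard xml.etree.ElementTree and a made-up desc fragment to show that an XML parser treats a CDATA section as plain character data, and that re-serializing the parsed tree does not bring the CDATA markers back.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a description wrapped in a CDATA section.
source = "<desc><![CDATA[Scales of <Justice> & more]]></desc>"

# To an XML parser the CDATA section is only an escaping mechanism,
# so its content arrives as ordinary text.
elem = ET.fromstring(source)
parsed_text = elem.text

# Serializing the tree again escapes the text with entities;
# the original CDATA delimiters are gone.
round_tripped = ET.tostring(elem, encoding="unicode")

print(parsed_text)
print(round_tripped)
```

If a serialization like this is then handed to an HTML parser as text/html, there is no CDATA marker left for that parser to treat as a comment, so the text is displayed.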

I suspect that somewhere there is a semantic gap and you and I are using the same terms to mean different things.

Almost certainly.

I believe that they are parsing my content as XML, stripping out tags that aren’t white-listed, and serializing the result in a way that does not preserve the original CDATA designations

Take away the white-listing and that’s pretty much my definition of “interpreting your content as XHTML”. As I understand it, a CDATA section is merely a syntactic shortcut - it has no semantic value in XHTML - so preserving it serves no purpose. If replacing your CDATA sections with entity escaping changes the meaning of your content then surely it can’t be considered to be XHTML.

then serving the resulting serialization as if it were HTML4 strict.

They could serve the result as a single PNG rendering of your post and I would still consider them to be interpreting your content as XHTML if the rendering matched the output of a compliant XHTML web browser. And as I said before, their rendering looks to me almost identical to Firefox’s rendering of your web page.

So I ask again: is Firefox getting it wrong too? Or am I missing something?

Posted by James Holderness at

Take away the white-listing

Take away the content, and you have a blank page.  I’m not sure what your point here is.  They are doing white-listing.  And the white-listing along with the transformations that they perform and along with the substitution of mime type changes the rendering of the page.

They could serve the result as a single PNG rendering of your post

If they did the rendering, and served it with the correct mime type, I would agree.  But instead of rendering it themselves, they are doing something else, something that for lack of a better word I would call transforming.  They transform my content into another form, substitute a different MIME type and send it on its way, to be rendered by the user agent.  The combination of the transformation and MIME type substitution performed by Google Reader first ignores the indication that the content is XHTML, then it strips the <![CDATA[ ]]>, and the combination of the two causes data which would have been invisible to appear.

As near as I can tell, you seem to be hung up on the word ‘wrong’.  Fine.  I’ll use a different word then.  Taking the clearly specified intent of a post, one that is protected in multiple ways, and systematically subverting each and every one of them is somewhat... ‘unfortunate’.  Wouldn’t you agree?

Posted by Sam Ruby at

I’m not sure what your point here is.  They are doing white-listing.  And the white-listing along with the transformations that they perform and along with the substitution of mime type changes the rendering of the page.

My point is that it’s the white-listing that is the problem. The transformations and mime type changes are of no consequence.

The combination of the transformation and MIME type substitution performed by Google Reader first ignores the indication that the content is XHTML, then it strips the <![CDATA[ ]]>, and the combination of the two causes data which would have been invisible to appear.

If they preserved the mime type and left the CDATA section exactly as you had it, but continued to strip the svg elements that aren’t in their white-list, they’d get exactly the same result. You can prove this for yourself just by viewing your page in Firefox with svg disabled.

As near as I can tell, you seem to be hung up on the word ‘wrong’.

You implied that feed readers that didn’t render your post as you intended were not following specs. Since I happen to have a feed reader that renders your post exactly as Google Reader is rendering it (presumably not as you intended), I figured there was a specification somewhere that I wasn’t following. If I’m not doing anything ‘wrong’ then that can’t be the case, and I don’t need to worry. If I am doing something wrong, please point me to whatever specification I’m not following so I can correct my mistake.

Taking the clearly specified intent of a post, one that is protected in multiple ways, and systematically subverting each and every one of them is somewhat... ‘unfortunate’.

It’s unfortunate that your post doesn’t render as you intended it, but nobody is trying to subvert anything. It would seem that your intent wasn’t as clear as you thought.

Posted by James Holderness at
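
A rough sketch of the white-listing argument, assuming a filter that drops the tags of unlisted elements but keeps the text inside them. The ALLOWED set, the strip_unlisted helper, and the sample entry are all hypothetical; nothing here is taken from Google Reader's actual implementation.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical white-list of local element names.
ALLOWED = {"div", "p"}

def strip_unlisted(elem, allowed=ALLOWED):
    """Serialize elem, keeping tags only for white-listed elements.

    Text inside unlisted elements is kept, which is the behavior assumed
    here. Namespaces and attributes are ignored to keep the sketch short.
    """
    inner = escape(elem.text or "")
    for child in elem:
        inner += strip_unlisted(child, allowed)
        inner += escape(child.tail or "")
    local_name = elem.tag.split("}")[-1]
    if local_name in allowed:
        return f"<{local_name}>{inner}</{local_name}>"
    return inner

# Made-up entry: an SVG image with a desc, followed by the post body.
entry = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<desc><![CDATA[Scales of Justice.]]></desc>"
    '<path d="M0 0"/>'
    "</svg>"
    "<p>Post body.</p>"
    "</div>"
)

filtered = strip_unlisted(ET.fromstring(entry))
print(filtered)
```

With svg and desc off the list, the description text ends up as ordinary content of the surrounding div, ahead of the post body.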

My point is that it’s the white-listing that is the problem.

I fully disagree.

If they had SIMPLY white-listed, AND retained the <![CDATA[ ]]> markers, AND then sent the bytes along to be processed as innerHTML on a page which was transmitted with a content-type of text/html, then the description would have been treated like a comment.

But what they did was white-list and disregard the <![CDATA[ ]]> markers, AND then sent the bytes along to be processed as innerHTML on a page which was transmitted with a content-type of text/html, which resulted in the description being displayed as content.

Posted by Sam Ruby at

I fully disagree.

I’m pretty sure we’re never going to agree on this - we seem to be arguing in circles. So let me ask you just one last question (hopefully just a simple yes or no). Do you think the way Firefox renders your content with svg disabled is ‘unfortunate’ or ‘wrong’ or 'not following specifications'?

If your answer to that is no, then I really don’t care what you think of Google Reader.

Posted by James Holderness at

I have no idea how to turn off SVG on Firefox, nor do I care to know.  But it would seem to me that would only be relevant if Google Reader were serving its content with a mime type of application/xhtml+xml, but it does not.

But to answer your question: XHTML (which Firefox supports but Google Reader doesn’t trigger by virtue of substituting a different file type) w/o SVG would mean that the desc would show.  I guess that means that you win.

Posted by Sam Ruby at
