It’s just data

Mime Sniff

Barth & Hickson: The above algorithm is a willful violation of the HTTP specification

Mark Pilgrim was unavailable for comment.


That same text is in the HTML5 spec, bottom of 2.7.1: [link]

Posted by Ian Hickson at

I’m familiar with that document.  And, yes, I do remember seeing those words there before.

However, these words now find themselves in an entirely new context.  That’s noteworthy.  At least to me.

Ideally, that ID would become an RFC, and the duplication in HTML5 would be replaced by a reference.

Given what I know about the IETF, and in particular how they evaluate consensus, I must say that I have serious doubts that that would ever happen.

Posted by Sam Ruby at

I’m curious, Sam, how you would feel about a future revision of the Atom Syndication Format including a section on sniffing the type of the title element (as one example), if it were shown that feed producers were frequently supplying incorrect values for the type attribute.

I don’t doubt that such information might be useful to implementers, but I’m not so sure that it belongs in a spec.

Posted by James Holderness at

Given what I know about the IETF, and in particular how they evaluate consensus, I must say that I have serious doubts that that would ever happen.

It would probably be hard to get consensus for the standards track. But if the RFC editor will publish things like RFC 4559, I don’t see what prevents this from becoming an informational RFC. Information wants to be free!

Posted by anonymous at

James: I would not be happy with such a thing.  And while I’m not exactly thrilled with this ID, I understand its motivation, and see the case as being subtly different.  And I do mean subtle.  As the author of a tool which consumes feeds, I presume that you routinely ignore RFC 3023.  You are routinely faced with conflicting information, and the advice the existing RFCs provide is a bit... suboptimal.

A friend of mine, and now a colleague of Ian’s, once observed that (and I’m paraphrasing here), “it is almost like the authors of the various RFCs all had root access to their web servers”.  My take on the same phenomenon can be found here.

As to dealing with problematic titles, my preference is for tools to provide the ability to provide per feed overrides.

Anonymous: it is a judgment call, but my experience is that the IETF will only endorse informational RFCs that they deem to be “non controversial”.  Even with some creative labeling, I have doubts that this will pass that bar.

Posted by Sam Ruby at

I presume that you routinely ignore RFC 3023

Indeed. I also routinely ignore the conformance requirements of the XML 1.0 specification. But I wouldn’t dream of proposing an XML 1.x specification that included normative parsing algorithms for handling mismatched tags and undefined entities.

It seems to me that a lot of what is currently being proposed for the HTML5 spec belongs in an implementation guide, or a best practices document. The information is useful, but doesn’t belong in the spec.

IMHO. I’m not seriously trying to convince anyone of anything; just expressing my thoughts in passing.

As to dealing with problematic titles, my preference is for tools to provide the ability to provide per feed overrides.

That’s a cool solution for Venus, but I don’t think that sort of thing would work for my target audience (that includes me). I couldn’t be bothered fiddling with settings every time I came across a broken feed. I want my reader to be smart enough to fix the problems for me.

Posted by James Holderness at

what is currently being proposed for the HTML5 spec belongs in an implementation guide, or a best practices document. The information is useful, but doesn’t belong in the spec.

Those are examples of what I was referring to above by “creative labelling”.  I’m still profoundly sceptical that even such approaches would fly in the IETF.

Posted by Sam Ruby at

James: “the HTML5 spec belongs in an implementation guide”

It is an implementation guide.

Posted by Bill de hÓra at

It’s not just about creative labelling. The problem is in the way the data is presented. Both this ID and HTML5 in general read as normative specifications, not informational documents.

Informational: Version X of Internet Explorer, upon encountering the byte sequence ABCD, will interpret the document as having a content-type of foo/bar.
Normative: All HTTP processors, upon encountering the byte sequence ABCD, must interpret the document as having a content-type of foo/bar.

The former information I’d find extremely useful. The latter is just another specification I’d routinely ignore.
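To make the distinction concrete, here is a minimal, hypothetical sketch of the kind of signature table an informational document might describe. The table holds a few well-known file signatures and is not the table from the Internet-Draft; the names `SIGNATURES`, `sniff`, and `effective_type` are invented for illustration, and the policy in `effective_type` is just one possible choice.

```python
from typing import Optional

# A few well-known file signatures (magic numbers). Illustrative only;
# this is NOT the table from the Internet-Draft under discussion.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff(body: bytes) -> Optional[str]:
    """Return the media type suggested by the leading bytes, or None."""
    for magic, media_type in SIGNATURES:
        if body.startswith(magic):
            return media_type
    return None


def effective_type(declared: Optional[str], body: bytes) -> Optional[str]:
    """Hypothetical policy: trust the declared type, sniff only if it is absent."""
    return declared if declared else sniff(body)
```

Read informationally, the table only says what a given consumer does with certain bytes. Whether `effective_type` should ever let the sniffed value override a declared one is exactly the normative question.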

Posted by James Holderness at

James: point taken.  And I’ll go further, and modify my previous position: an approach such as the one that you describe conceivably could have a chance as an informational RFC in the IETF.  It would not by any means be a slam dunk, but might have a chance.

As to whether Barth and Hickson would be open to such an idea, I cannot say.

Posted by Sam Ruby at

Mark Pilgrim, Now Reality-Encumbered

Sam Ruby: “Mark Pilgrim was unavailable for comment.” Mark Pilgrim: “There is no right or wrong. There is only what works and what doesn’t.”...

Excerpt from Rob Sayre's Mozilla Blog at

Shrinking HTML5

Splitting out sections of HTML5....

Excerpt from Anne’s Weblog at

I guess I don’t understand the point of a spec if it’s going to be ignored. Why would we not want all specifications to be implementation guides? Isn’t that the point of having a spec?

I agree that taking this along the RFC standards track is going to be... interesting.

Posted by Ian Hickson at

Point to ponder: what’s the point of the IETF if standards minted there are permitted to willfully ignore each other?

I believe that the preferred process in this case would be to revise HTTP instead.  As luck would have it... now might be a good time to do exactly that.

Posted by Sam Ruby at

Sam,

could you elaborate exactly which part of HTTP you think should be revised?

Best regards, Julian

Posted by Julian Reschke at

Julian:

First, the cop-out answer: I don’t believe that the IETF should provide two different standards, one depending on the other, when it is not possible to implement both correctly at the same time.

Now to directly answer your question: I believe that Authoritative Metadata needs to be revisited.  It attempts to resolve conflicts with simple precedence rules, getting the rules pretty much backwards.  Meanwhile RFC 3023 needs to either be radically revised or withdrawn.

Simply put, conflicting metadata should be an error.  Recovery should be left up to the application layer.  Yes, that causes problems.  But nothing like the problems the alternative presents, particularly given that applications and users either can’t or won’t implement the standards as spec’ed.
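A rough sketch of what that could look like for one narrow case, the charset named in the Content-Type header versus the encoding named in the XML declaration. Everything here is an assumption made for illustration: the names `ConflictingMetadata` and `resolve_encoding` are hypothetical, the XML declaration is assumed to be readable as ASCII, and encoding aliases are not normalized.

```python
import re
from email.message import Message
from typing import Optional


class ConflictingMetadata(Exception):
    """The transport and the document disagree about the encoding."""


# Assumes the XML declaration itself is ASCII-compatible.
XML_DECL = re.compile(rb'^<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def header_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter (lower-cased) from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(body: bytes) -> Optional[str]:
    """Extract the encoding named in the XML declaration, if there is one."""
    match = XML_DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def resolve_encoding(content_type: str, body: bytes) -> Optional[str]:
    """Return the agreed encoding; raise instead of picking a winner."""
    outer = header_charset(content_type)
    inner = declared_encoding(body)
    if outer and inner and outer != inner:
        raise ConflictingMetadata(f"header says {outer}, document says {inner}")
    return outer or inner
```

The library layer never applies a precedence rule. An application that catches `ConflictingMetadata` can then choose its own recovery: sniff, apply a per-feed override, or reject the document.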

Between the two answers I provided above, I must say that the first fascinates me more.  As awesome as Ian is, I don’t believe he can do everything at once, though I secretly hope he proves me wrong.  Something must give, even if only a little.

Posted by Sam Ruby at

That comment from Ian Hickson you link to above is amusing — it asserts that committees of companies are evil, yet dictatorship is good? grins

Posted by Laurens Holst at

Julian,

one way to recognize that MIME type sniffing is needed in some cases would be to define classes of products for different types of products and contexts. People doing the right thing (in their own business context) could then continue to do the right thing, and others would have a way to recover and deal with issues.

Though I would go further than just error recovery. Error recovery without a reporting mechanism is doomed to spiral. An ecosystem can improve if a reporting system is in place. The secondary issue is that a reporting system, if not done appropriately, invites abuse (spam, DoS, etc.).

Posted by karl at

Karl,

understood. What I was trying to understand was whether there’s an expectation of changes in HTTPbis. After all, RFC2616 only defines Content-Type, but doesn’t specify any requirements on the recipient.

The only thing in RFC2616 I can currently think of which is related is the character set defaulting for text types. This is very hard to change, and already has an associated issue: Default charsets for text media types.

Posted by Julian Reschke at
