It’s just data

XML5

Anne van Kesteren: One of my side projects is XML5. Earlier this year I suggested the idea as XML 2.0, but in line with recent “jokes” about HTTP5, SVG5, and CSS5, XML5 makes perfect sense. The idea of XML5 is to provide a revision of XML 1.0, XML 1.1, Namespaces in XML 1.0, Namespaces in XML 1.1, and RFC 3023, that is backwards compatible and introduces HTML-like, although much more sane, error recovery.

Question: should XHTML5 be based on XML5 or XML1?

Warning: brainstorming ahead.  Don’t groan.

Suppose HTML5 defined not one, and not two, but three serializations.  The first would be identified by the MIME type text/html.  The second by application/xhtml+xml.  The third by a MIME type of text/html; subtype=xml.
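
As a rough illustration of how a client might tell the three apart, here is a minimal sketch. The function names and the 1/2/3 numbering are assumptions made for this example, and the subtype=xml parameter is only the proposal floated above, not a registered MIME parameter.

```python
def parse_content_type(header):
    """Split a Content-Type header into (mime_type, params)."""
    parts = [part.strip() for part in header.split(";")]
    mime_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = value.strip().strip('"').lower()
    return mime_type, params


def pick_serialization(header):
    """Return 1, 2 or 3 for the serializations described above, else None."""
    mime_type, params = parse_content_type(header)
    if mime_type == "application/xhtml+xml":
        return 2  # parsed by an XML 1.0 parser
    if mime_type == "text/html":
        # Hypothetical opt-in switch: text/html; subtype=xml
        return 3 if params.get("subtype") == "xml" else 1
    return None
```

A client that implements only some of the parsers could call pick_serialization and fall back to whichever parser it has when the preferred one is missing.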

General purpose browsers like Mozilla could support all three.  Special purpose browsers may choose to support fewer parsers.  In particular, applications that require streaming support may choose not to implement the first type, and truly micro browsers may choose not to implement the first two (accepting some loss of fidelity in rendering some web pages).

Those who wished to include SVG or MathML inside otherwise valid, but not well-formed, HTML pages could do so with either a minor change to the MIME type or the addition of a <meta> tag.  IE would continue to ignore these elements, but at least authors wouldn’t have to perform heroic acts in their .htaccess files any more.  And should IE ever wish to join the party, Microsoft would have the opt-in switch that they were looking for.


I definitely think there’s value to this.  Ignoring error handling for the moment, the two main things I’d like to see are dropping DTDs and forbidding named entities (even the built-in ones like &lt;); just force the use of numeric character references.
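
For authors, moving to numeric character references can be mechanical. The following is a minimal sketch, assuming Python's standard html.entities.html5 table as the source of entity names; the function name is made up for this example, and the regex does not skip comments or CDATA sections.

```python
import re
from html.entities import html5  # maps names such as "mdash;" to their text

_NAMED_REFERENCE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_references(markup):
    """Replace known named entities with decimal numeric character references."""
    def replace(match):
        expansion = html5.get(match.group(1) + ";")
        if expansion is None:
            return match.group(0)  # unknown name: leave it untouched
        # Some names expand to more than one code point, so emit one
        # reference per character.
        return "".join(f"&#{ord(char)};" for char in expansion)

    return _NAMED_REFERENCE.sub(replace, markup)
```

Note that this rewrites &lt; and &amp; as well, which is what forbidding even the built-in names would require.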

Posted by James Snell at

And would documents conformant to serializations 2,3 be identical (except for the MIME-type)?

That is, serializations 2,3 would differ only in their parsing model (2 being parsed by an XML 1.0-compliant parser, 3 being parsed by an XML5-compliant parser)?

That would be a big improvement on the mess that is Appendix C.

Posted by Jacques Distler at

And would documents conformant to serializations 2,3 be identical (except for the MIME-type)?

The devil’s in the details.  For example, &mdash; is legal in XHTML, but hasn’t proven to be very interoperable.  I would hope that serialization 2 would be explicit about which predefined entity names are valid, perhaps even restricting them to the empty set as James suggests (I personally would allow the ones allowed by XML: &amp;, &lt;, &gt;, &quot;, and &apos;).

Posted by Sam Ruby at

+1 for restricting to just the 5 predefined named entities in both serializations 2,3. (I realize that serialization 2, now known as XHTML5, doesn’t currently do that; it should.)
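
A restriction like this is easy to check for. Below is a minimal sketch of such a check; the function name is hypothetical, and the regex only approximates XML's name syntax and ignores comments and CDATA sections.

```python
import re

# The five entity names predefined by XML itself.
XML_PREDEFINED = frozenset({"amp", "lt", "gt", "quot", "apos"})

# Named references only; numeric ones such as &#160; start with "#" and are skipped.
_NAMED_REFERENCE = re.compile(r"&([A-Za-z_:][\w.:-]*);")


def disallowed_entities(markup):
    """Return the sorted names of entity references outside the XML five."""
    names = set(_NAMED_REFERENCE.findall(markup))
    return sorted(names - XML_PREDEFINED)
```

An empty result means the document uses only the five names that any XML parser resolves without a DTD.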

Posted by Jacques Distler at

My idea was to no longer have XML 1.0 / XML 1.1 basically. If we keep them there’s less of a win I think. Also, I would actually like to introduce a bunch of new predefined entities from HTML and MathML.

Posted by Anne van Kesteren at

My idea was to no longer have XML 1.0 / XML 1.1 basically.

You expect every consumer of XML (from Sam’s Venus to the XML libraries in my favourite programming language) to convert to XML5 parsing?

Wow! You do dream big.

Also, I would actually like to introduce a bunch of new predefined entities from HTML and MathML.

Predefined entities are a nightmare, unless you control both ends of the wire.

There are 2200 named entities in HTML+MathML (plus whatever “new” ones you wish to define). Are you actually going to require that clients which don’t do MathML actually support all those entities anyway? In light of distributed extensibility, why should &conint; (and its 2000 friends) be grandfathered in?

Posted by Jacques Distler at

My idea was to no longer have XML 1.0 / XML 1.1 basically.

It is one thing to say that a specific product, say Opera, would choose to treat application/xhtml+xml as XML5; but quite another to say that XML 1.0 (and what little XML 1.1 there is) would no longer exist.

If we keep them there’s less of a win I think.

The WHATWG’s position, as I understand it, has basically been that it can’t prevent an XML 1.0 serialization of HTML5, so it might as well define it.  That statement continues to be true.

I also believe that the current MIME type for application/xhtml+xml is a big impediment, second only to the well-formedness requirement.  Being able to gracefully degrade to text/html for recalcitrant browsers is worth doing.
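
For comparison, the usual workaround today is server-side content negotiation on the Accept header. The following is a minimal sketch of that idea, not a description of any particular server's configuration; the function name is an assumption, and wildcards such as */* are deliberately not treated as XHTML support.

```python
def choose_content_type(accept_header):
    """Serve application/xhtml+xml only to clients that explicitly accept it."""
    for item in accept_header.split(","):
        parts = item.split(";")
        mime_type = parts[0].strip().lower()
        quality = 1.0  # default q-value when none is given
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # malformed q-value: treat as not acceptable
        if mime_type == "application/xhtml+xml" and quality > 0:
            return "application/xhtml+xml"
    return "text/html"
```

A browser that never lists application/xhtml+xml in its Accept header gets text/html, which is the graceful degradation described above.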

Also, I would actually like to introduce a bunch of new predefined entities from HTML and MathML.

I share Jacques’s concern.  Basically, at this moment, anybody who wishes to serve XHTML today and wants to work with Opera has already learned not to depend on any predefined entities beyond those defined by XML.  Even &nbsp; is problematic.  But as long as the results are well defined and interoperable, I’m OK.

Posted by Sam Ruby at

Anyone remember UTF-8+names?

Posted by Aristotle Pagaltzis at

IMHO HTML5 should stick to XML1.

HTML5 already has an error-proof XML-ish mode that’s just fine for all those authors who think they can generate well-formed XML with echo() without going insane.

If XML5 comes along, we’ll end up with yet another serialisation that looks like XML1, but that you can’t rely on being compatible with an XML1 parser (as happened with real-world XHTML).

Posted by kL at

