PaceSimpleContentType

Abstract

Given a separate technique for creating and managing arbitrary content types (PaceSimpleResourcePosting, PaceNonEntryResources), the opportunity arises to vastly simplify the remaining inline content cases to only XML text (markup and/or characters) and escaped HTML markup, and content-by-reference for specifying full body content using an arbitrary Internet media type. This proposal removes the @type attribute of "Content Constructs" and removes the 'base64' mode (replaced by content-by-reference). The 'escaped' mode is replaced by 'escaped-html', as that is the only backwards compatible use for escaped markup.

PaceContentSrc is an alternative that provides a "src" attribute for content-by-reference for all Content constructs, where this Pace only allows a "src" attribute for full body content and alternates.

Status

Open.

Author: KenMacLeod

Revised: 29-Jul-2004, 15-Jul-2004, 8-Jun-2004

PaceSimpleResourcePosting -- how to create related or arbitrary, non-entry resources
PaceNonEntryResources -- how to create and manage related or arbitrary, non-entry resources

Rationale

The "unbounded openness" of allowing arbitrary MIME content types in Content Constructs has always been strongly debated. One of the principal use cases for this unbounded openness was the ability to create and manage "complex" and multipart entries. Recently, several proposals have come forward (PaceSimpleResourcePosting, PaceNonEntryResources) to handle complex, multipart entries in a more robust and direct manner. Therefore, the content models of the <name>, <title>, <tagline>, <copyright>, <info>, <summary>, and <content> elements can be reduced to just the common cases of escaped HTML or XML text (markup and/or characters).

Further, the Content construct of the format has never been thoroughly specified. It does not specify how an arbitrary media type resource should be interpreted or profiled. In particular, it does not define what a "resource" or "payload" is, whether or not the mode attribute is base64. The Content construct, and the atom:content element in particular, never specified the informative elements originally found in content (internal wiki link currently broken).

This proposal allows content-by-reference only for the atom:content element ("src" and "type"), and alternatives for that content using a new atom:content-alternate element.

One other use case for multipart/alternative content or allowing multiple <content> elements is multiple language content. multipart/alternative was only ever specced for <content> and not other fields, which tends to make this case not applicable. This proposal relies on the user creating unique entries and using xml:lang for multiple languages.

Why is <content-alternate> required when alternate textual content could be inline in the <content> element?

<content-alternate> is required to support multiple textual or non-textual alternatives. Since <content-alternate> is required for that purpose, it seems simpler to exclude the combined use than to describe the simultaneous use of <content> with @src, with element content, and with <content-alternate>s as well.

Proposal

(1) Replace section 3.1 Content Constructs with the following:

xml

A mode attribute with the value "xml" indicates that the element's content is inline XML text (for example, a series of XML characters without any non-character markup, namespace-qualified XHTML, or XML text from other XML namespaces).

Examples:

  <title>Ben &amp; Jerry&apos;s, yumm! :-&gt;</title>

  <content mode="xml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      Here's some <em>important</em> mathematics:

      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mrow>
          <mi>4</mi>
          <mo>&gt;</mo>
          <mi>3</mi>
        </mrow>
      </math>
    </div>
  </content>
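
As a rough illustration of how a consumer might read mode="xml" content, the Python sketch below parses a shortened version of the example above and collects the inline child elements. It is only a sketch under assumptions: the function name inline_xml_children is hypothetical, the Atom elements are left unqualified as in the examples on this page, and "xml" is treated as the default mode because the <title> example carries no mode attribute.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

SNIPPET = """<content mode="xml">
  <div xmlns="http://www.w3.org/1999/xhtml">
    Here's some <em>important</em> mathematics.
  </div>
</content>"""


def inline_xml_children(content):
    """Return (leading_text, child_elements) for a mode="xml" construct.

    Assumption: a missing mode attribute means "xml".
    """
    mode = content.get("mode", "xml")
    if mode != "xml":
        raise ValueError("not inline XML content: mode=%r" % mode)
    # Text before the first child element; whitespace-only text is dropped.
    leading_text = (content.text or "").strip()
    return leading_text, list(content)


content = ET.fromstring(SNIPPET)
leading_text, children = inline_xml_children(content)

# ElementTree reports namespace-qualified names as "{namespace}local".
for child in children:
    print(child.tag)
```

The child element arrives with its namespace attached, so a consumer can dispatch on the namespace to choose a profile (XHTML, MathML, or something it does not recognize).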

escaped-html

A mode attribute with the value "escaped-html" indicates that the element's content is an escaped string of HTML markup; the version of HTML is undefined. The string is passed to an HTML processor to be rendered.

Examples:

  <title mode="escaped-html">Ben &amp;amp; Jerry&amp;apos;s, yumm! :-&amp;gt;</title>

  <content mode="escaped-html">
    Here's some &lt;em&gt;important&lt;/em&gt; mathematics:
    4 &amp;gt; 3.
  </content>
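
The two layers of escaping in the example above can be seen in a short Python sketch. The XML parser removes the outer layer and yields a string of HTML markup, which is then handed to an HTML processor. This is an illustration only: the TextCollector class is a hypothetical stand-in for a real renderer, and the Atom element is left unqualified as in the examples on this page.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SNIPPET = """<content mode="escaped-html">
  Here's some &lt;em&gt;important&lt;/em&gt; mathematics:
  4 &amp;gt; 3.
</content>"""


class TextCollector(HTMLParser):
    """Hypothetical stand-in for an HTML renderer: records tags and text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.chunks.append(data)


content = ET.fromstring(SNIPPET)

# Step 1: the XML parser undoes the XML-level escaping, so the element's
# text is now HTML source containing "<em>" and "&gt;".
html_source = content.text

# Step 2: the HTML source is passed to an HTML processor.
collector = TextCollector()
collector.feed(html_source)
collector.close()
rendered_text = "".join(collector.chunks)

print(html_source.strip())
print(collector.tags)
```

Note that "&amp;gt;" in the feed becomes "&gt;" in the HTML source and only becomes ">" once the HTML processor has run.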

(2) Replace section 4.13.10 "atom:content" Element with the following:

Example:

  <content src="kitty.jpg" type="image/jpeg" />

(3) Add a new section 4.13.XX "atom:content-alternate" Element with the following:

Example:

  <content src="kitty.tiff" type="image/tiff"/>
  <content-alternate src="kitty.jpg" type="image/jpeg"/>
  <content-alternate>White calico with light and dark tans resting on a couch pillow.</content-alternate>
  <content-alternate><surface-texture xmlns="http://example.com/surface/ns#">...</surface-texture></content-alternate>
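
One way a consumer might choose among <content> and its alternates is sketched below in Python. The selection policy is an assumption, not part of this proposal: take the first by-reference representation whose media type the consumer supports, with <content> considered first as the most faithful, and otherwise fall back to the first inline alternate. The enclosing <entry> wrapper and the function names representations and pick are hypothetical.

```python
import xml.etree.ElementTree as ET

ENTRY = """<entry>
  <content src="kitty.tiff" type="image/tiff"/>
  <content-alternate src="kitty.jpg" type="image/jpeg"/>
  <content-alternate>White calico with light and dark tans resting on a couch pillow.</content-alternate>
</entry>"""


def representations(entry):
    """List the content and content-alternate children in document order."""
    found = []
    for el in entry:
        if el.tag not in ("content", "content-alternate"):
            continue
        src = el.get("src")
        if src is not None:
            # Content-by-reference: only src and type matter.
            found.append({"src": src, "type": el.get("type"), "inline": None})
        else:
            # Inline XML text: keep the element so markup is preserved.
            found.append({"src": None, "type": None, "inline": el})
    return found


def pick(entry, supported_types):
    """Assumed policy: first supported by-reference item, else first inline."""
    reps = representations(entry)
    for rep in reps:
        if rep["src"] is not None and rep["type"] in supported_types:
            return rep
    for rep in reps:
        if rep["src"] is None:
            return rep
    return None


entry = ET.fromstring(ENTRY)
choice = pick(entry, {"image/jpeg"})
print(choice["src"], choice["type"])
```

A text-only consumer would call pick(entry, set()) and receive the inline description instead of either image.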

(4) In each Content construct Atom element, specify whether it uses "inline", "paragraph", or "block" content.

atom:title -- inline
atom:tagline -- inline
atom:copyright -- inline
atom:info -- inline or paragraph (specify)
atom:summary -- paragraph or block (specify)
atom:content -- block

(5) Add new section, Content Profiles, and subsections as below:

XML Characters (plain text)

XML 3.3.3

XHTML

http://www.w3.org/1999/xhtml

[guidance?]

Also: rendering issues, DOCTYPE, quirks, css, charsets, must ignore.

XML

Future specifications or general practice are expected to profile usage of XML qualified in other namespaces. As a good practice, XML content SHOULD consist of one namespace-qualified element. Consumers MUST be able to accept mixed-content, including content where XML characters precede any start-tags or empty-element tags or follow any empty-element tags or end-tags. Processing of character content that is outside of any namespaced element is undefined. User agents that encounter XML namespaces that are not renderable must display the document in such a way that it is obvious to the user that normal rendering has not taken place.
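
The good-practice rule and the mixed-content requirement above can be checked mechanically. The Python sketch below is one possible reading, not a normative test: it treats "one namespace-qualified element" as exactly one qualified child with no non-whitespace text around it, and it reports rather than rejects mixed content, since consumers must accept it. The function name profile_report and the report keys are hypothetical.

```python
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace URI of an element, or None if unqualified."""
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return None


def profile_report(content):
    """Describe how inline XML content relates to the XML profile."""
    children = list(content)
    # Character data outside any child element: text before the first
    # child plus the tail text following each child.
    outside = [content.text or ""] + [child.tail or "" for child in children]
    stray_text = any(piece.strip() for piece in outside)
    namespaces = [namespace_of(child) for child in children]
    single = (
        len(children) == 1
        and namespaces[0] is not None
        and not stray_text
    )
    return {
        "single_qualified_element": single,
        "stray_text": stray_text,
        "namespaces": namespaces,
    }


XHTML = "http://www.w3.org/1999/xhtml"
good = ET.fromstring(
    '<content mode="xml"><div xmlns="%s">Hi</div></content>' % XHTML
)
mixed = ET.fromstring(
    '<content mode="xml">Before <b xmlns="%s">bold</b> after'
    '<br xmlns="%s"/></content>' % (XHTML, XHTML)
)

print(profile_report(good))
print(profile_report(mixed))
```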

Escaped HTML

[guidance?]

Impacts

This proposal deprecates the @type attribute (it can be ignored by processors).

The @mode value of "base64" is dropped. The extent of the usage of this mode is unknown. The TypePad Atom implementation uses base64 entries for uploading photos, which this Pace presumes will be superseded by PaceSimpleResourcePosting / PaceNonEntryResources.

The @mode value of "escaped" is changed to "escaped-html" as a cue to users. During transition, consumers receiving "escaped" should treat it as "escaped-html".

Non-text content that previously may have been found in <content> (extent unknown) is moved to a <link> construct.
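
The transition rules above might be applied by a consumer roughly as in the following Python sketch. Two points are assumptions rather than part of this proposal: a missing mode attribute is treated as "xml", and any other mode, including the dropped "base64", is rejected with an error. The function name effective_mode is hypothetical.

```python
def effective_mode(attrs):
    """Map a Content construct's attributes to the mode a consumer uses.

    attrs is a plain dict of attribute names to values. The deprecated
    "type" attribute is never consulted, which is how it gets ignored.
    """
    mode = attrs.get("mode", "xml")  # assumption: "xml" is the default
    if mode == "escaped":
        # Transitional rule: treat the old value as "escaped-html".
        return "escaped-html"
    if mode in ("xml", "escaped-html"):
        return mode
    # Assumption: dropped or unknown modes (e.g. "base64") are an error.
    raise ValueError("unsupported mode: %r" % mode)


print(effective_mode({"mode": "escaped", "type": "text/html"}))
```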

Extensibility

The current specification has several dimensions of extensibility (content type, mode of encoding, partial content, XML namespaces, multipart content) that contribute to its complexity.

This proposal reduces the extensibility to two areas:

The XML namespace of the atom:content element content. The Atom specification will provide a profile of XHTML that Atom implementations must support, while leaving open the ability to support other XML namespaces with additional profiles. Because XML namespaces do not require a central registrar, there is no need to register either the namespaces or the profiles. On the other hand, developers will benefit from having some technique (e.g., RDDL at the namespace URL) or central location from which to find profiles and processing guidelines for use of other XML namespace-qualified content within Atom.
Alternate content uses the <link> element, Internet media types, and the transfer modes of the relevant href scheme (MIME or HTTP).

Notes

Changes from 15-Jul-2004:

Noted that some Atom elements, like atom:title or atom:summary, may restrict content to inline or a single block, like a paragraph.

Changes from 8-Jun-2004 (3-Jul-2004) (diff):

Reference updated Paces for non-entry resource uploading.
Replace link/@rel="content" with content/@src, which seems to be the favored solution for content-by-reference.
Separate <content> and <content-alternate> so that the most faithful representation can be indicated.

Changes from original version (diff):

added use-cases to Rationale
replaced fixed media types with a fixed selector of XML text or escaped HTML markup.
added specification on how to access alternate non-XML text content via <link>
added Extensibility section

CategoryProposals