MultipleContentDiscussion


See also content, EchoExample, MimeContent, ComponentBlog.


Multiple Content Elements

[MarkHershberger] In MimeContent, the case is made for a single <content> element and using multipart/alternative or multipart/related to include different sections. I like the idea since having multiple <content> elements is confusing, but then feed readers have to know our funky XML translation of MIME. Bottom line: multiple <content>'s are ambiguous.

From EchoExample and content, the comments below refer to this example:

  <content type="multipart/alternative"> 
    <content type="image/jpeg" encoding="base64"> 
      xo+Hello0AFWeblogh5FWorldh1mImagedsTbrVbF3 
    </content> 
    <content type="text/html" xml:lang="en-us" mode="escaped" rel="fragment"> 
      <![CDATA[<p>Hello, <em>weblog</em> world! 2 &lt; 4!</p>]]> 
    </content> 
    <content type="application/xhtml+xml" xml:lang="en-us" rel="fragment"> 
      <p xmlns="http://www.w3.org/1999/xhtml"> 
        Hello, <em>weblog</em> world! 2 &lt; 4! 
      </p> 
    </content> 
    <content type="application/pdf" src="http://example.org/blog/hello.pdf" /> 
  </content> 
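
As a rough illustration of what a feed reader would have to do with the nested form above, here is a minimal Python sketch that picks one alternative by media type. The function name `pick_alternative`, the preference list, and the trimmed sample are assumptions made for illustration; nothing here is defined by the format under discussion.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example above (the base64 image part is omitted).
SAMPLE = """\
<content type="multipart/alternative">
  <content type="text/html" xml:lang="en-us" mode="escaped">
    <![CDATA[<p>Hello, <em>weblog</em> world!</p>]]>
  </content>
  <content type="application/xhtml+xml" xml:lang="en-us">
    <p xmlns="http://www.w3.org/1999/xhtml">Hello, <em>weblog</em> world!</p>
  </content>
  <content type="application/pdf" src="http://example.org/blog/hello.pdf" />
</content>
"""


def pick_alternative(container, preferred_types):
    """Return the child <content> matching the earliest preferred type, or None."""
    children = container.findall("content")
    for wanted in preferred_types:
        for child in children:
            if child.get("type") == wanted:
                return child
    return None


root = ET.fromstring(SAMPLE)
# Hypothetical reader preference: plain text first, then PDF, then HTML.
chosen = pick_alternative(root, ["text/plain", "application/pdf", "text/html"])
print(chosen.get("type"), chosen.get("src"))
```

Note that the reader has to know that the outer type attribute names a methodology rather than a real MIME serialization, which is exactly the concern raised below.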

[JamesAylett RefactorOk] Having a single containing <content> element and then using the methodology of multipart/* is fine, but please, please, we should not be saying that the <content> element contains content "of MIME type multipart/alternative" (or multipart/anything) unless it is precisely in that MIME format.

[KenMacLeod] Very good point. It's clear that the intent is to use the methodology, and you are correct that those MIME types (multipart/*) already have a serialization.

Possible solutions:

[JamesAylett RefactorOk] I would argue that by far the best way of doing this is to not have any direct support in Atom, but simply to note that "an XML serialisation of the concepts of MIME multipart would be a convenient mechanism for multiple content entities or representations". The best way of doing this would be to introduce a new media type, which I'd argue should probably be application/x.multipart+xml rather than multipart+xml/* (because a whole new major type will take ages to standardise). However this is really beyond what Atom should be defining, particularly because if it's done right, it's applicable far beyond this field. So:


[MarkNottingham] The current model allows for 1+ (possibly 0+) content modules. What does it mean to have multiple content modules in the same entry? Are they expected to be semantically equivalent (e.g., HTML vs. plain text of the same content)? If they're different, what does that mean? Each one has a media type; are content modules required to have distinguished types in that domain? Hmm. They have identifiers, and media types... seems to me you might call them, oh, I don't know... maybe Representations?

[JamesSnell] Yes, the requirement should be that all ContentModules in a given WellFormedEntry MUST be semantically equivalent.

[MarkNottingham] This gets especially interesting when you mix in the discussion re: language tags. It seems like this might be moving into the realm of a portable Web representation format, which has already been hinted at in PASWA (not that I'm *very* happy with that part of PASWA) as well as in Graham Klyne's XMLization of MIME messages. See also the related MIME discussion.

If this is accepted, it means that we should be considering this stuff with a RESTifarian hat on (I know some will object, but hey - it works). If that's the case, we should probably allow/force each content module to specify not only a media type, but also a (base) language, optional encoding (e.g., "This content is base64-encoded"), and so forth.

Taking this to its (possibly) logical conclusion, that would leave us with a model like:

Entry (Resource)

The cool part is where you can substitute a URI for the entity-body (just like in PASWA) and fetch it remotely, instead of shipping it around with the representation. That way, the metadata and content uses of the model are united.
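
One way to picture that model is the following Python sketch. The class and field names (`Representation`, `Entry`, `by_reference`, `decoded_body`) are hypothetical and chosen only for illustration; the only ideas taken from the discussion are that each content module carries a media type, a language, and an optional encoding, and that a URI may stand in for the entity-body.

```python
import base64
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Representation:
    media_type: str
    lang: Optional[str] = None
    encoding: Optional[str] = None  # e.g. "base64"; None means plain text
    body: Optional[str] = None      # inline entity-body
    src: Optional[str] = None       # URI to fetch instead of an inline body

    def __post_init__(self):
        # Assumption: a representation is either inline or remote, never both.
        if (self.body is None) == (self.src is None):
            raise ValueError("exactly one of body or src must be set")

    @property
    def by_reference(self) -> bool:
        return self.src is not None

    def decoded_body(self) -> bytes:
        """Return the inline body as bytes, undoing base64 if it was declared."""
        if self.by_reference:
            raise ValueError("body is remote; fetch src instead")
        if self.encoding == "base64":
            return base64.b64decode(self.body)
        return self.body.encode("utf-8")


@dataclass
class Entry:
    uri: str
    representations: List[Representation] = field(default_factory=list)


entry = Entry(
    uri="http://example.com/items/54",
    representations=[
        Representation("text/plain", lang="en", encoding="base64", body="aGk="),
        Representation("text/html", lang="en", src="http://example.com/items/54.html"),
    ],
)
print(entry.representations[0].decoded_body())
```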

[JamesSnell] Ok, it's 11:30 at night and I'm just not sure I'm grokking this, but, if it mean what me think it mean then me think it good but me not sure if it really mean what me think it mean. In any case, a single WellFormedEntry should be capable of containing or referencing multiple semantically equivalent representations of its content. Each ContentModule is a unique entity with its own UniqueIdentity.

[MarkNottingham] I don't think that's the point. What I'm leaning towards is something where an entry looks like (if you'll excuse the crassness of a serialization, this is just thinking out loud):

<item uri="http://example.com/items/54">
  <creationDate>2003-06-12</creationDate>
  <content type="text/html" xml:lang="en" title="Stuff"> <h:p xmlns:h="..">This is the content</h:p> </content>
  <content type="text/plain"> .. </content>
</item>
Or it could look like:
<item uri="http://example.com/items/54">
  <creationDate>2003-06-12</creationDate>
  <content type="text/html" xml:lang="en" title="Stuff"> <include href="http://example.com/items/54.html" /> </content>
  <content type="text/plain"> .. </content>
</item>
(There are, of course, lots of ways to serialize this, so don't get hung up on that now.)
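
To make the by-value versus by-reference distinction concrete, here is a small Python sketch that reads either of the two serializations above. The helper names `describe_content` and `first_content` are hypothetical, and the XHTML namespace URI is filled in as an assumption so the sample parses (the original leaves it as "..").

```python
import xml.etree.ElementTree as ET

INLINE = """<item uri="http://example.com/items/54">
  <creationDate>2003-06-12</creationDate>
  <content type="text/html" xml:lang="en" title="Stuff"><h:p xmlns:h="http://www.w3.org/1999/xhtml">This is the content</h:p></content>
</item>"""

BY_REF = """<item uri="http://example.com/items/54">
  <creationDate>2003-06-12</creationDate>
  <content type="text/html" xml:lang="en" title="Stuff"><include href="http://example.com/items/54.html" /></content>
</item>"""


def describe_content(content):
    """Return ("reference", href) if the element wraps an <include>, else ("inline", text)."""
    include = content.find("include")
    if include is not None:
        return ("reference", include.get("href"))
    return ("inline", "".join(content.itertext()).strip())


def first_content(xml_text):
    """Parse an <item> and describe its first <content> child."""
    return describe_content(ET.fromstring(xml_text).find("content"))


print(first_content(INLINE))
print(first_content(BY_REF))
```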

So, we could require at least one "content module" (not really sure about that name), but the actual content might be there by reference, not by value. That way, you can have metadata-based entries (e.g., news RSS), or you can have content-based entries (e.g., weblog RSS).

[TimothyAppnel] How is the actual content as a reference different from a permalink, assuming you have one ContentModule? This is an area where turning a conceptual data model into practice seems rather fuzzy to me.

[MarkNottingham] That's a good point. Content modules contain syndicated content, and they usually don't have the context of the original (e.g., ads, Web site navigation, style, etc.); the permaLink is to the original (the source of the syndication) that does contain this stuff. I think that's an important distinction to make, and preserve in the model.

[PhilWolff] If I understand you correctly, you are describing alternate representations, not really multiple content. The syndication equivalent of the img alt tag. Is that what was intended? Assistance with transcoding and translation?

[AsbjornUlsberg, RefactorOk] Whoa, I think you all are mixing things together here. :) First, there is a big difference between <content> and <feed>. A "permalink" as I understand it, goes to a <feed>, and I can't see that we have discussed or come to any consensus on whether <content> should have permalinks or not, or on what a permalink in that context is.

Each feed should be retrievable through some kind of URI. Let's call this the permalink. This URI should point to a resource returning the exact same data as you are looking at, e.g. the Echo feed. Other representations of the feed (HTML etc.) should have alternative <link> elements pointing to them, or some other kind of external referencing method.

Each <content> can be retrievable via a URI, but doesn't have to be. The reason why <content> should be retrievable over web protocols like HTTP is that <content> can be a big CD image, ZIP archive, a downloadable database, etc. Such content is just stupid to have inline in the Echo feed, unless you have to because of firewall problems and such.

So, <content> should in most cases contain the content inline, but in many cases it's much better to have it by reference. We need support for both methods, and each method serves different needs. They neither exclude nor replace one another; every type of content can be inline and externally referred. I hope I'm not alone in this view.
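
A feed producer that supports both methods might decide between them along these lines. This is only a sketch: the function `build_content`, the `inline_limit` threshold, and its default value are assumptions, and the `src` and `encoding="base64"` attributes simply follow the EchoExample sample earlier on this page.

```python
import base64
import xml.etree.ElementTree as ET


def build_content(media_type, data, url=None, inline_limit=64 * 1024):
    """Build a <content> element, inline for small data and by reference for large data."""
    el = ET.Element("content", {"type": media_type})
    if url is not None and len(data) > inline_limit:
        # Large payload with a known location: ship only the reference.
        el.set("src", url)
    else:
        # Small payload, or no URL available: carry it inline as base64.
        el.set("encoding", "base64")
        el.text = base64.b64encode(data).decode("ascii")
    return el


small = build_content("text/plain", b"hi", url="http://example.org/hi.txt")
big = build_content("application/zip", b"\0" * 100,
                    url="http://example.org/archive.zip", inline_limit=10)
print(ET.tostring(small, encoding="unicode"))
print(ET.tostring(big, encoding="unicode"))
```

The fallback to inline content when no URL is given loosely mirrors the "firewall problems" case above, where inlining is the only option.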


CategoryMetadata, CategoryModel