
TimestampVsCreationDateTime


What should the timestamp be?


<issued>



CategoryElementSpec


<modified>



Which times should be required? You may vote for more than one. [OpenPoll]
1. When the post was created (creation-date, dc:created): DavidYule
2. When the user says the post was published (post-date, dc:issued): JoeGregorio, TimothyAppnel, MishaDynin, AaronSw, DannyAyers, MarkPilgrim, GaryF, JonathanBroad, MarcusCampbell, GrantCarpenter, DiegoDoval, BryantDurrell, TomasJogin, StanFinley, JakeSutton, AliAghareza, JeremyGray (dc:issued), MaciejCeglowski, ArveBersvendsen, ChrisWilper, AdamRice (dc:issued), ZhangYining
3. When the post was last modified (last-modified-date, dc:modified): TimBray, MishaDynin, DannyAyers, KenCoar, ChrisWilper, DiegoDoval, BryantDurrell (but see comments re: defaulting to post-date, below), GrantCarpenter, StanFinley, JakeSutton, AliAghareza, JeremyGray (dc:modified), MaciejCeglowski (dc:modified), ZhangYining
4. Adopt Dublin Core dates directly: KenMacLeod, AsbjornUlsberg
5. None (Entries should not require a timestamp): AllenFirstenberg, ChrisLawrence, HansGerwitz (consumers can use inherited timestamps from the feed for entries if needed, so the feed must specify at least one time)

Discussion summary: The first three choices (creation-date, post-date and last-modified-date) will all be available for producers to add to their entries. The vote only decides which date is required in every entry. Whatever isn't chosen will still be available as an optional information element. "Adopt Dublin Core dates directly" means using the existing DublinCore date definitions.

Informational

Both Blogger and Movable Type use 'post-date' (set by default to 'creation-date') on their HTML pages.

Pros and cons of last-modified-date:

Prior Art:

Discussion

[SimonWillison] - I find the word "issued" slightly confusing. Is there any reason not to use "published" instead?


[AaronSw, RefactorOk] I've withdrawn my support for last-modified-date. I don't see how we can require a field that one of the major weblogging tools (Movable Type) does not support by default. Doing so will make adoption of the format very difficult. However, we should strongly encourage support.

[BillKearney, RefactorOk] Both MovableType and LiveJournal internally track both creation and modification timestamps. MT has a lastmodified plug-in and LJ seems to be able to expose the properties. The data is present so it's not like it can't be considered. Likewise RadioUserland directly tracks creation date and modification date can be extracted. What tools, if any, don't support both creation and modification timestamps? If they all support them then it's not unreasonable to consider requiring these values. If they're missing, compelling arguments might encourage the developers to implement them.

[TimothyAppnel] This does need clarification. Considering an external model, I think the timestamp is/should be the post date. Creation date (when did I first enter this into the system?) is more of an internal concern that, if communicated, should be part of an optional module -- namely the Publishing module.

[TimBray] The paragraph above beginning "2 is most useful" puzzles me; surely the last-mod date on HTML pages that the server gives you is 3, not 2. At ongoing each post has a creation date, which is when I started writing it, and an updated date, which is the last time I posted it. The creation date can never change. Updated changes every time I correct/change/enhance an entry. It seems to me the updated-date (3 in the list above) is the fundamental thing that should be in the basic log entry data model, but maybe I'm missing something.

[TimothyAppnel] There are a number of reasons for keeping a date other than created and last modified, which is 2. I suppose it comes down to editorial control. It's kind of like the date on a magazine cover or newspaper: they are published anywhere from a day to months before the date on the cover. A simple practical case in which I would not want created and last-modified dates is when I import a batch of entries from another system. Using the last-mod date on the HTML (3) is problematic because if I were to modify the template and nothing else, it sets off a false positive. Furthermore, all of my entries will have the same date. An aggregator trying to sort or merge entries can't do that anymore. It's reasons like these that systems such as MT and Blogger keep this third, user-controllable date.

[MishaDynin, RefactorOk] If I had to pick one, I would choose PostDate; but I don't see any problem in requiring both LastModified and PostDate.

[JoeGregorio] Misha, I recovered your comment from the archive; someone deleted it in a reshuffle earlier today. I brought it back because I think it is very relevant to the discussion. To me it is a clear example of three different kinds of dates you could associate with a post.

This may be related to the discussion of notification at SiteAndSyndication or of version tracking in BiblioGraphy.

[ShelleyPowers] Interesting that you're saying what the timestamp should be, rather than what it's defined to be. This could end up being an interpretation of how weblogging tools should work, rather than an understanding of the weblogging data model to be used for interoperability.

[TedLeung] It seems to me that post date is the thing to be used for interoperability, since (as Misha has noted) post date is the thing that changes the observable behavior of a weblog.

[MishaDynin, RefactorOk] TimBray's comment made me rethink my position. The last-modified timestamp is important for programs -- for example, caching systems can use last-modified dates as version numbers. And I can't think of a situation when providing last-modified dates is a burden. Post date (or publication-timestamp) specifies the ordering humans care about most; last-modified specifies the ordering computers care about most. Both should be required.

Last-modified time stamp allows you to decide which of the two distinct versions of an entry is more recent. This is a common enough problem that I think we should build in a way to solve it.

[TimothyAppnel] I agree with Tim Bray that version number should be left up to the system and/or publisher. Some applications may find last-modified date insufficient -- perhaps they'd like to affix a digest hash of some type. Post-last-modified is admittedly a useful piece of information, but I don't think all uses of a well-formed entry will need it or be capable of keeping meaningful information. The example of a caching system and last-modified date seems more appropriate at the HTTP level. In keeping the core simple I'd suggest it not be required -- this piece of information can be included using the BiblioGraphy module.

[MishaDynin] Can you give an example of a system that would be incapable of providing meaningful last-modified information? HTTP-level last-modified is not as useful because multiple entries can be transmitted in one HTTP request.

[AaronSw, RefactorOk] And because template changes will cause last-modified to change.

[TimothyAppnel] TimBray has mentioned the weblog tool he spun only has the HTTP last-modified data. Blosxom users would have a similar issue. I believe if I were transferring my entries from another system (or implementation) to MT it would reset the post-last-modified date to the import date. MT does not give you direct access to the last-modified date to correct this. (This is without checking, but I'm fairly sure this is the case.)

[JonathanBroad, RefactorOk] As a librarian, I've got to vote for number 2. Although I understand the attractiveness of last-modified, it doesn't seem like the right hook on which to hang the temporal aspect of a posted content object. The post-date, on the other hand, fixates a post at a given more-or-less unchanging moment in time that the author of the post has taken responsibility for. Think of it as another kind of identifier. Of course, last-modified is also incredibly useful information and should be widely supported. Creation-date seems kinda immaterial.

[KenCoar, RefactorOk] Since this is about which item should be required, I think the last-modified date is the most important. How many entries actually get edited after posting? I suspect it's a single-digit percentage or less. For all the others, the last-modified date and the post-date will be equivalent, so there's no loss there, but caching clients and others capable of distinguishing will be able to do so. Which is more important to the end-user? Knowing when a modified post first appeared, or knowing that it has changed and should probably be revisited? And I don't quite get AaronSw's comment about template changes making a difference. Maybe it's because I don't use a template system, but aren't templates a rendering feature? What effect would they have on this meta-representation?

[ChrisWilper, RefactorOk] I vote for last-modified at a minimum, and would strongly recommend an optional publish-timestamp. A tool obviously needs to notify the user when something is new. Well, without any dates at all, it can be done, but it's not reliable. For instance, it can ask: Have I ever seen this item (identified by uri) before? No, ok, it's new. But with last-modified-timestamp, changes to an existing entry can be reliably determined (as opposed to comparing the contents of the cached entry, which is a brittle method). This seems like a basic minimum of functionality to support. If publish-timestamp is available, that's a nice piece of information for the aggregator to use (for example to sort things according to the user's prefs), but requiring it doesn't seem necessary. Update: I support either both or last-modified.

[KenCoar, RefactorOk] Also, I don't get the complaints about 'existing tools won't support this-or-that'. Unless I'm mistaken, the tools are going to need to be modified to work with this format anyway, since it isn't RSS. N'est-ce pas?

[DannyAyers, RefactorOk] Très vrai! Movable Type does not yet support any of the Echo format. Six Apart does however support the initiative - if there is consensus that this feature is required, then there is little doubt that they will implement it. I agree with Shelley - we should be looking at the model for best practice on how we should define this, rather than trying to follow current, possibly flawed approaches. HTTP on its own is no real use here. If a caching proxy is involved, then the client agent should be informed and interpret dates accordingly - this isn't an issue for the format. Although I think the timestamp terms should be defined as part of Echo, I also believe it would be worth defining them with reference to Dublin Core to help remove ambiguity from the spec and allow wider interoperability at no extra cost. (Copied from a comment at Sam's) The UTC time refers to the moment, the timezone is really about geography. For 99+% of the time its value will remain the same for a given feed, why bother wasting bandwidth? I'd suggest it goes in with an (optional) extension like GeoURL. The local time can be calculated by any agent that considers it significant. re. format : the W3CDTF of ISO 8601 has a lot going for it - i18n, standardization, ease of reading/writing, and it sorts well. PS. I just re-read the W3CDTF spec and using this would in effect give a UTC time with the timezone offset being optional (which sounds about right to me), e.g.

[BryantDurrell, RefactorOk] Ken Coar: there are varying degrees of modification. Using MT as an example, generating a new template is relatively easy (defined as "does not require touching perl code") as long as you don't require any data that isn't provided by the existing MT template tags. The post-date is available as an MT template tag; the other proposed dates are not.

[JonathanBroad, RefactorOk] Although determining changedness via comparing an entry to a cached copy is a brittle method, it is feasible. There is no way to determine publish-date retroactively, which is precisely why I think that if a choice needs to be made the post-date is the most important metadata. It places the entry in its original context, allowing for an elemental kind of sequencing. Additionally, for most uses I can think of it's more important that a tool know if something has changed than when. Otherwise, you might as well specify a versioning module (a series of whens) and give the whole story, right?

[StanFinley, RefactorOk] I think we have to ask ourselves if the required timestamp for a log entry should be representative of the original time the inception of the idea or concept of the post occurred. Philosophically it makes more sense to me that the timestamp implies that I had some particular idea or concept I wished to convey at some particular time and therefore my log entry reflects that time. If I revise my entry, I may have merely corrected a spelling error. A further consideration is the question of whether a new timestamp representing my revised log entry should re-ping any trackback like connection to other external log entries and if it does, should those external logs overwrite my previous entry or add a new trackback which displays my revised entry at revision time. If so, my revisions display on the other log in reverse chronological order. This becomes redundant so this question probably needs to be addressed, perhaps in another section or category of this wiki.

[AlisonWheeler, RefactorOk] I'd have to second this - there are often times on shared journals where the date and time stamp of who got there first (prior art, if you will) is needed. The others are still useful but when it first reached the net is an important one that shouldn't get thrown out.

[MishaDynin, RefactorOk] RoadMap says the tool developers are on board. Besides adoption concerns, are there any good arguments against requiring issued (post date) and modified timestamps (as specified in DublinCore), with optional created?

[GrantCarpenter, RefactorOk] If last-modified-date is the lowest common required denominator, downstream tools will have to default to displaying items in modification order, not creation order. This seems undesirable since the most commonly desired display order would be creation order (post-date). Still, being able to simultaneously maintain creation order while displaying modification date is desirable, imo--if only one can be included for simplicity's sake then post-date is my vote. Personally I could see making both required--it seems reasonably trivial on the tool side (defaulting to last-modified-date equals post-date to external agents if separate values aren't internally maintained).

[DiegoDoval] I agree with both Misha Dynin and Chris Wilper. Both post-date and modification-date should be required. The publication date is important for people, the last-modified date is important for programs. For something like a newsreader, having the last-modified date plus the URI would be enough to determine whether an item has been changed, and the tool can deal with it accordingly (e.g., ignore it, replace the original, show change history, etc).
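As a rough illustration of the newsreader case described above, the following sketch classifies an incoming entry as new, changed, or unchanged using only its URI and last-modified date. The `classify` function and the `seen` dictionary are hypothetical names chosen for this example; nothing here is part of any proposed format or existing tool.

```python
from datetime import datetime, timezone


def classify(seen, uri, modified):
    """Return 'new', 'changed', or 'unchanged' and remember the entry.

    `seen` is a hypothetical cache mapping an entry URI to the most
    recent last-modified time the reader has stored for it.
    """
    previous = seen.get(uri)
    if previous is None:
        status = "new"
    elif modified > previous:
        status = "changed"
    else:
        status = "unchanged"
    # Only move the stored time forward, never backward.
    if previous is None or modified > previous:
        seen[uri] = modified
    return status


seen = {}
uri = "http://example.org/entry/1"  # placeholder URI
t1 = datetime(2003, 7, 1, 12, 0, tzinfo=timezone.utc)
t2 = datetime(2003, 7, 2, 9, 30, tzinfo=timezone.utc)

first = classify(seen, uri, t1)   # entry never seen before
second = classify(seen, uri, t1)  # same last-modified date
third = classify(seen, uri, t2)   # later last-modified date
```

What the tool does with a "changed" entry (ignore it, replace the original, show change history) is left open, as in the comment above.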

[BryantDurrell, RefactorOk] Is it possible to require post-date and specify that last-modified is assumed to be the same as post-date unless there's a separate last-modified date present? Justification: the fewer elements, the easier it is to implement and the less likely it is that someone will trip up. Consequences of error here are not great, but still -- I think KISS applies.

[Manuzhai, RefactorOk] That sounds like a nice idea. I support having both post-date and last-modified-date as required: you need the post-date for ordering the entries, and you'd use the last-modified-date in determining if an entry has changed (maybe you need to rebuild some sort of cache). In order to keep it simple, you could assume last-modified-date to be equal to post-date if not given. It intuitively makes sense, it saves bandwidth, but it adds a bit of complexity.
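The defaulting rule suggested in the two comments above might look like this on the consumer side. This is only a sketch: the `Entry` class and its field names are assumptions made for illustration, with `issued` standing in for post-date and `modified` for last-modified-date.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Entry:
    uri: str
    issued: datetime                     # post-date, always present
    modified: Optional[datetime] = None  # last-modified-date, may be omitted

    def effective_modified(self) -> datetime:
        """Treat a missing last-modified-date as equal to the post-date."""
        return self.modified if self.modified is not None else self.issued


posted = datetime(2003, 7, 1, 12, 0, tzinfo=timezone.utc)
edited = datetime(2003, 7, 3, 8, 15, tzinfo=timezone.utc)

untouched = Entry("http://example.org/entry/1", issued=posted)
revised = Entry("http://example.org/entry/2", issued=posted, modified=edited)
```

The extra complexity mentioned above is visible here: every consumer has to go through something like `effective_modified()` rather than reading the field directly.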

[AliAghareza, RefactorOk] I think both post-date and last-modified should be required. If I considered a change to an entry only as a correction, then I would say only post-date should be required, but why only make corrections? If I wanted to have a single post (that might map to a single idea that I have) that I keep adding onto, shouldn't my entry account for its evolution with some kind of modification-date?

[JeremyGray, RefactorOk] I would have voted for #4, but can't see creation date as having useful relevance to echo. As for dc: vs. non-dc:, I'd rather see echo leverage DublinCore where possible. Additionally, I would suggest that Zulu timestamps be suggested or perhaps even required as they simplify searching and sorting, remove issues regarding regional differences between changes to and from daylight savings time within a given time zone, and can easily be converted to and from a given user's local time when required.
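To make the sorting argument for Zulu timestamps concrete, here is a small sketch using only the Python standard library. The `to_zulu` helper is a hypothetical name, and the input strings are made-up examples in an ISO 8601 form with an explicit offset.

```python
from datetime import datetime, timezone


def to_zulu(stamp: str) -> str:
    """Normalize an ISO 8601 timestamp with an offset to UTC ('Z') form."""
    local = datetime.fromisoformat(stamp)
    if local.tzinfo is None:
        raise ValueError("timestamp needs an explicit UTC offset")
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


stamps = [
    "2003-07-01T08:00:00-04:00",  # 12:00 UTC
    "2003-07-01T13:30:00+02:00",  # 11:30 UTC
]

raw_order = sorted(stamps)                      # plain string sort on local times
zulu_order = sorted(to_zulu(s) for s in stamps) # string sort after normalizing
```

Sorting the raw strings puts the 08:00 entry first even though it is the later moment; after normalizing to UTC, a plain string sort gives chronological order.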

[ArveBersvendsen] I believe that the only required date should be dc:issued, but that the spec should allow for the full range of DublinCore dates to be used: That way, the different tools will all have room to evolve, while existing tools will still be able to do the job.

[AllenFirstenberg, RefactorOk] Several of the UseCases have examples where the entries (as opposed to the feed as a whole) do not require a date. RSS today does not require that the Items have a date, and there are feeds that make use of this. Imposing a date limits how we can use an entry.

[DavidYule] Surely the only thing certain about a log entry is that it has been created. It may not have been published (e.g. draft posts), and it may not have been modified. I would also think that the creation date is the most important date for the creator - publication date and modification date are things you might want your tool to support, but they're not fundamental. Seems a bit backward to mandate Publication/Modification dates to documents where neither might apply, while making Creation date optional when it must have happened :)

[AdamRice] I am in favor of treating the original date of publication as an entry's "canonical" date. It is important to be able to track revisions, but in a feed using only modified dates (speaking as a heavy user of NetNewsWire) I would not like to see entries in feeds juggling their sequence if an author corrects some misspellings after posting. We could also get seemingly contradictory situations where a comment on an entry pre-dates the entry. So if it came down to a choice of one or the other, I would pick "published," but if a consensus supports requiring published and modified, I would support that.
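The reordering concern can be shown with two made-up entries, where the older one receives a small correction after the newer one is posted. The dictionary keys below are assumptions for this sketch only.

```python
from datetime import date

# Two hypothetical entries; "first" is corrected on 3 July, after "second" was posted.
entries = [
    {"title": "first", "issued": date(2003, 7, 1), "modified": date(2003, 7, 3)},
    {"title": "second", "issued": date(2003, 7, 2), "modified": date(2003, 7, 2)},
]

by_issued = [e["title"] for e in sorted(entries, key=lambda e: e["issued"])]
by_modified = [e["title"] for e in sorted(entries, key=lambda e: e["modified"])]
```

Ordering by the issued date keeps the original sequence ("first", then "second"), while ordering by the modified date swaps the two entries because of the correction.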

[ZhangYining] The last-modified-date is, IMHO, necessary. Consider the problem of concurrent edits clobbering each other, brought up by AaronSw regarding the EchoAPI (see RestEchoApiDiscuss). Comparing last-modified-dates might fix the problem.
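One way the comparison of last-modified-dates could guard against clobbering is sketched below with an in-memory store. `EntryStore` and `StaleEntryError` are hypothetical names invented for this example; this is not the EchoAPI, just an illustration of the idea under the assumption that a client sends back the last-modified date it originally read.

```python
from datetime import datetime, timezone


class StaleEntryError(Exception):
    """Raised when an edit is based on an out-of-date copy of an entry."""


class EntryStore:
    def __init__(self):
        self._entries = {}  # uri -> (content, last-modified)

    def put(self, uri, content, now):
        self._entries[uri] = (content, now)

    def get(self, uri):
        return self._entries[uri]

    def update(self, uri, content, expected_modified, now):
        """Apply an edit only if the entry is unchanged since it was read."""
        _, current_modified = self._entries[uri]
        if current_modified != expected_modified:
            raise StaleEntryError(uri)
        self._entries[uri] = (content, now)


store = EntryStore()
uri = "http://example.org/entry/1"  # placeholder URI
t0 = datetime(2003, 7, 1, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2003, 7, 1, 12, 5, tzinfo=timezone.utc)
t2 = datetime(2003, 7, 1, 12, 6, tzinfo=timezone.utc)

store.put(uri, "draft", t0)
_, read_by_a = store.get(uri)  # editor A reads the entry
_, read_by_b = store.get(uri)  # editor B reads the same version

store.update(uri, "edit from A", read_by_a, t1)  # succeeds
try:
    store.update(uri, "edit from B", read_by_b, t2)  # B's copy is now stale
    clobbered = True
except StaleEntryError:
    clobbered = False
```

A limitation worth noting: two edits that land within the same timestamp resolution would not be told apart, so a timestamp alone is a weaker check than an explicit version number or digest.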


These definitions may be clearer:

post-creation-timestamp

when the post's content was created (mostly used internally, see BiblioGraphy)

post-last-modified-timestamp

when the post's content was last changed (mostly used internally, see BiblioGraphy), defaults to 'post-creation-timestamp'

publication-timestamp

when the post's content was (is scheduled to be) first published, usually from 'post-creation-timestamp'

The creation and last-modified timestamps of the representation of a post (i.e. the HTML files generated when combining the post's content with a template) are outside the scope of a well-formed log entry discussion. In the event a system does not track the post-creation and post-last-modified timestamps, the representation's created and last-modified timestamps may be used as substitutes.
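The defaults in the definitions above can be read as a fallback chain. The sketch below is one possible reading, not a settled rule: it assumes a system "tracks" post timestamps whenever a post-creation-timestamp is present, and the function and parameter names are hypothetical.

```python
from datetime import date


def resolve(post_created=None, post_modified=None, publication=None,
            repr_created=None, repr_modified=None):
    """Resolve the three timestamps using the defaults described above.

    Assumption: if post_created is missing, the system does not track
    post-level timestamps and the representation's timestamps are used.
    """
    tracks_post_times = post_created is not None
    created = post_created if tracks_post_times else repr_created

    if post_modified is not None:
        modified = post_modified
    elif tracks_post_times:
        modified = created        # defaults to post-creation-timestamp
    else:
        modified = repr_modified  # substitute from the representation

    # publication-timestamp usually comes from post-creation-timestamp
    published = publication if publication is not None else created
    return {"created": created, "modified": modified, "published": published}


tracked = resolve(post_created=date(2003, 7, 1))
untracked = resolve(repr_created=date(2003, 7, 1), repr_modified=date(2003, 7, 4))
scheduled = resolve(post_created=date(2003, 7, 1), publication=date(2003, 7, 10))
```

The `untracked` case shows the weakness raised earlier in the discussion: a template rebuild would move the representation's last-modified date even though the post itself did not change.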

Given the number of sites that use dates in their permalink URIs, it seems important that entry-date correspond to the permalink date, and thus to the publication-timestamp.

The opposite may be true: the permalink matters for identification, and publication-timestamp should be post-last-modified-timestamp to indicate versioning. See discussion in PermaLinks.

Dublin Core

DublinCore develops metadata standards and metadata vocabularies for describing resources. DublinCore standards are layered, including definition of terms, use within other standards, and syntaxes.

Use in ''describing'' the model of a WellFormedEntry

DublinCore defines a date term as "A date associated with an event in the life cycle of the resource" with a comment, "Typically, Date will be associated with the creation or availability of the resource."

Additional, more specific terms, called refinements, include:

created

Date of creation of the resource.

valid

Date (often a range) of validity of a resource.

available

Date (often a range) that the resource will become or did become available.

issued

Date of formal issuance (e.g., publication) of the resource.

modified

Date on which the resource was changed.

dateAccepted

Date of acceptance of the resource (e.g. of thesis by university department, of article by journal, etc.).

dateCopyrighted

Date of a statement of copyright.

dateSubmitted

Date of submission of the resource (e.g. thesis, articles, etc.).

At this level, these are not indications of syntax, just definitions.

This is a separate proposal from whether Dublin Core specific terms are used in the data model or syntax (below).

For example, within the model, we could say that "post date" is the "creation or modified date".

Use of Dublin Core terms as-is in a WellFormedEntry

Within the model, we could adopt the definitions of one or more Dublin Core terms directly. The "date" of an Entry is an "unqualified Dublin Core date", which would include its less specific meaning of "a date associated with an event in the life cycle of the Entry, typically its creation or availability date."

Logical, even if similar physically, extensions could then be used to be more specific, including 'created', 'issued', 'available', and 'modified'.

Using Dublin Core terms directly is option (4) at the top of the page.

This is a separate proposal from whether any particular syntax or representation is used.


CategoryMetadata, CategoryModel