Last week, I created a nightly job to verify that my inputs are clean, well-formed XML.
That took care of my inputs, but it didn't verify the process by
which the web pages were created.
I've since added some code to verify that each of the pages in
the cache of pages served in the past 24 hours is well-formed,
valid XHTML. This uncovered an interesting boundary case that
I hadn't considered.
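A check along these lines can be sketched in a few lines of Python. This is only an illustration, not the code actually used here: the function name `find_malformed`, the cache directory argument, and the use of file modification time as a stand-in for "served in the past 24 hours" are all assumptions. It also tests well-formedness only; the standard library parser does not validate against the XHTML DTD.

```python
import os
import time
import xml.etree.ElementTree as ET


def find_malformed(cache_dir, max_age_seconds=24 * 60 * 60, now=None):
    """Return (path, error) pairs for recent cache files that are not well-formed XML.

    Hypothetical helper: "recent" is approximated by file modification time.
    """
    now = time.time() if now is None else now
    failures = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # Skip files older than the window we care about.
            if now - os.path.getmtime(path) > max_age_seconds:
                continue
            try:
                ET.parse(path)  # raises ParseError if the file is not well-formed
            except ET.ParseError as exc:
                failures.append((path, str(exc)))
    return failures
```

Note that a page using a named entity with no DTD available to the parser would also be reported here, since the parser has no definition for it.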
I started using &mdash; instead of two hyphens. Besides being valid in more situations, it is probably a more proper use of punctuation in general (and hey, maybe less ambiguous semantically).
Jay, good suggestion - for the future. Meanwhile, I now have some defensive code in place.
Basil, good catch. But I'm concerned that such a meta tag would cause IE to throw up a hairball or something. In any case, I see no need for that particular meta tag in this instance, so I am removing it. Given the way I cache pages, it will be a few days before this ripples through all of my entries.
Note that a meta http-equiv statement will not be recognized by XML processors, and authors SHOULD NOT include such a statement in an XHTML document served as 'application/xml' (and 'application/xhtml+xml' as well for that matter).
Jay: started using &mdash; instead of two hyphens.
Sam: Jay, good suggestion - for the future. Meanwhile, I now have some defensive code in place.
Can &mdash; be allowed in an XML document without being defined in the DTD? (Are you planning to include DTD definitions for all such entities in your feed?) I believe the NCR form is the right way to go!
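To make the NCR suggestion concrete, here is a rough sketch of rewriting named HTML entities as numeric character references, so that an XML parser does not need a DTD to resolve them. The helper name `to_ncr` is made up for this example, and the regular expression is a simplification that does not try to skip CDATA sections or comments.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser understands without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_ncr(text):
    """Replace known named HTML entities with decimal numeric character references."""
    def repl(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return _ENTITY.sub(repl, text)
```

With this, `to_ncr("a &mdash; b")` gives `a &#8212; b`, which parses as XML with no entity declarations at all.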