Meta Charset Update
Anne van Kesteren: If your host has already configured your server like this you can not alter the character encoding using a META element. Every document that suggests otherwise is incorrect.
Nearly ten months ago, I set out to tackle internationalization issues on my weblog. My research included not only the specs, but also experimentation with web browser software. My conclusion at the time was that the specs and the browsers were out of sync.
Time for an update. For starters, Anne points to an emerging standard that is consistent with previous W3C specs and tutorials.
But do these specs represent reality? In my DevCon 2004 slides, I asserted otherwise. This was based on testing I had done, in particular two tests:
- iso-8859-1: While the HTTP content-type header specifies that this page is utf-8, it actually contains iso-8859-1 content, and it does everything it can within the page to say so.
- utf-8: Again, the HTTP content-type header specifies that this page is utf-8, this time correctly. However, the XML declaration and the meta tag attempt to fool the browser into thinking otherwise.
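Both tests probe the same precedence rule: a charset in the HTTP Content-Type header wins, and the in-page declarations only matter when the header is silent. Here is a minimal sketch of that rule as I read the specs. The function names are mine, the META scan is a naive regular expression rather than a real parser, and the XML declaration is ignored for brevity.

```python
import re

def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type or "", re.I)
    return match.group(1).lower() if match else None

def charset_from_meta(body):
    """Return the charset named in a META element, or None (naive scan)."""
    match = re.search(rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)',
                      body, re.I)
    return match.group(1).decode("ascii").lower() if match else None

def effective_charset(content_type, body):
    """Pick the charset a spec-following browser would use (hypothetical helper)."""
    # The HTTP header wins; META is consulted only when the header says nothing.
    return charset_from_header(content_type) or charset_from_meta(body)

page = b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
print(effective_charset("text/html; charset=utf-8", page))  # header wins
print(effective_charset("text/html", page))                 # falls back to META
```

A browser that followed this rule would label both test pages utf-8, no matter what the META element or XML declaration claims.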
I'm now getting different results from the ones I reported last time, results that are more consistent with the standards as written. Perhaps the declarations that XML on the Web Has Failed were premature, and we only need to give it more time?
On the other hand, and on a much narrower scope, consensus continues to build that the notion of HTTP having a meaningful default charset is foolish.
Meanwhile, try the two tests above, and if you get any interesting results, please leave a comment saying what you saw and which browser (including version) you used. As I understand the specs, iso-8859-1 should be treated as if it had unprintable characters in it, and utf-8 should display correctly. After you view each page, try a refresh, particularly in IE.
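To see why the iso-8859-1 page should show unprintable characters when the header is obeyed, here is a small sketch of what a utf-8 decoder does with iso-8859-1 bytes. The sample word and the helper name are my own illustration, not content taken from the test pages.

```python
# The same word encoded two ways; \u00e9 is "e" with an acute accent.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")  # b'caf\xe9'
utf8_bytes = "caf\u00e9".encode("utf-8")         # b'caf\xc3\xa9'

def decode_as_utf8(data):
    """Decode bytes as utf-8, substituting U+FFFD for invalid sequences."""
    return data.decode("utf-8", errors="replace")

# The lone 0xE9 byte is not valid utf-8, so it becomes a replacement character.
print(repr(decode_as_utf8(latin1_bytes)))  # 'caf\ufffd'
# Genuine utf-8 round-trips cleanly.
print(repr(decode_as_utf8(utf8_bytes)))    # 'café'
```

A browser that instead renders the accented character on the iso-8859-1 page is presumably trusting the in-page declaration (or sniffing the bytes) over the HTTP header.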