intertwingly

It’s just data

Sometimes the dragon wins


Scott Johnson: ɥɦɐ I just had to try out some funky characters to see what would happen.  :)

An advantage of declaring this page as utf-8 is that I can distinguish between somebody typing the characters ɥɦɐ and somebody typing the numeric entities &#613;&#614;&#592;, which means that people don’t have to double escape if they want to talk about numeric entities on my weblog.
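To make the distinction concrete, here is a minimal Python sketch (not the code behind this weblog). It assumes a hypothetical `render_comment` helper that escapes markup characters on output and leaves non-ASCII characters alone, which is safe once the page is declared as utf-8.

```python
import html

# The three characters themselves: U+0265, U+0266, U+0250.
literal = "\u0265\u0266\u0250"

# What a commenter types when they want to talk about the entities.
entity_text = "&#613;&#614;&#592;"


def render_comment(text: str) -> str:
    """Hypothetical output step: escape &, < and > and nothing else.

    Non-ASCII characters pass through untouched because the page
    itself is served as utf-8.
    """
    return html.escape(text, quote=False)


# The literal characters are emitted as-is.
rendered_literal = render_comment(literal)

# The typed entities have their ampersands escaped once, so a browser
# shows the entity text rather than the characters it stands for.
rendered_entities = render_comment(entity_text)

# Decoding the rendered markup recovers exactly what each person typed.
round_trip_literal = html.unescape(rendered_literal)
round_trip_entities = html.unescape(rendered_entities)
```

The two inputs stay distinct all the way through, so nobody has to type `&amp;#613;` by hand to get `&#613;` displayed.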

But don’t try to search for ɥɦɐ.  While such a query will be properly URI encoded based on utf-8, that particular string does not appear in any text files.
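A short Python sketch of what happens to such a query. The percent-encoding part is standard. The `stored_text` value is only an assumption about one way the search can miss: text kept on disk as numeric entities rather than as utf-8 bytes.

```python
from urllib.parse import quote, unquote

# The search term: U+0265, U+0266, U+0250.
query = "\u0265\u0266\u0250"

# Each character is two bytes in utf-8, and each byte is percent-encoded.
encoded = quote(query, safe="")
print(encoded)  # %C9%A5%C9%A6%C9%90

# Decoding as utf-8 on the server gives back the original characters.
decoded = unquote(encoded, encoding="utf-8")

# Assumption for illustration: the file holds the entity form of the text.
stored_text = "Scott Johnson: &#613;&#614;&#592; I just had to try"

# A plain substring search then finds nothing, although the query was
# encoded and decoded correctly.
found = decoded in stored_text
print(found)  # False
```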

So, sometimes the dragon wins.  If you have a requirement for full text search, and you haven’t outsourced it to Google, then you need a database that understands encodings, and all of Julik’s points apply.

Before I deploy my Ruby-based weblog, I want to make sure that both FastCGI and a database that supports utf-8 are in place (Cornerhost is currently running MySQL 3.23.58).

Some footnotes: