tr and Regexp now understand Unicode

Ruby 1.9

    unicode(string).tr(CP1252_DIFFERENCES, UNICODE_EQUIVALENT).
      gsub(INVALID_XML_CHAR, REPLACEMENT_CHAR).
      gsub(XML_PREDEFINED) {|c| PREDEFINED[c.ord]}
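
The chain above names several constants and a `unicode` helper without showing their definitions. The following is a minimal sketch of how they might be defined so the chain runs end to end. Every value here is an assumption made for illustration: the CP1252 table is deliberately partial, the invalid-character pattern covers only the C0 control characters, and `unicode` and `xml_escape` are hypothetical helpers, not necessarily what the original code used.

```ruby
# Hypothetical definitions: the slide names these constants but not their values.

# A few C1 control code points that Windows-1252 assigns to printable
# characters, and the Unicode characters they stand for (partial table).
CP1252_DIFFERENCES = "\u0080\u0082\u0083\u0084\u0085"
UNICODE_EQUIVALENT = "\u20AC\u201A\u0192\u201E\u2026"

# Simplified: only the C0 control characters that XML 1.0 disallows.
INVALID_XML_CHAR = /[\x00-\x08\x0B\x0C\x0E-\x1F]/
REPLACEMENT_CHAR = "\uFFFD"

# Characters escaped in XML, keyed by code point (hence c.ord in the block).
XML_PREDEFINED = /[&<>"]/
PREDEFINED = { 38 => '&amp;', 60 => '&lt;', 62 => '&gt;', 34 => '&quot;' }

# Assumed helper: make sure the input is a UTF-8 string.
def unicode(string)
  string.encode('UTF-8')
end

# Hypothetical wrapper around the chain shown on the slide.
def xml_escape(string)
  unicode(string).tr(CP1252_DIFFERENCES, UNICODE_EQUIVALENT).
    gsub(INVALID_XML_CHAR, REPLACEMENT_CHAR).
    gsub(XML_PREDEFINED) { |c| PREDEFINED[c.ord] }
end
```

Under these assumptions, `xml_escape("\u0085 <a & b>")` returns `"\u2026 &lt;a &amp; b&gt;"`. Because `tr` maps character by character, the two strings passed to it must list their entries in the same order. Escaping all predefined characters in a single `gsub` pass avoids double-escaping the `&` of an entity that was just produced.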