UTF-8
UTF-8 has a nice property:
UTF-8 strings can be recognized fairly reliably by a simple algorithm. The probability that a string in any other encoding happens to be valid UTF-8 is low, and it diminishes as the string gets longer.
It is also implemented properly by Microsoft.
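
The recognition property mentioned above can be illustrated with a minimal Python sketch. It assumes that "a simple algorithm" can be approximated by attempting a strict UTF-8 decode; the function name looks_like_utf8 is hypothetical and chosen only for this example.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence.

    Hypothetical helper: it relies on Python's strict UTF-8 decoder,
    which rejects malformed lead and continuation bytes.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The same text encoded two ways.
utf8_bytes = "café".encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = "café".encode("latin-1")  # b'caf\xe9'

print(looks_like_utf8(utf8_bytes))    # valid UTF-8
print(looks_like_utf8(latin1_bytes))  # 0xE9 starts a sequence that never completes
```

The check is a heuristic, not a proof. Pure ASCII text is valid UTF-8 by design, and a short string in another encoding can pass by coincidence: the Latin-1 bytes for "Ã©" are b'\xc3\xa9', which is also the UTF-8 encoding of "é". Such coincidences become less likely as the string gets longer.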