Python 3.0a1
Guido van Rossum: The first Python 3000 release is out — Python 3.0a1. Be the first one on your block to download it!
$ python3.0
Python 3.0a1 (py3k, Aug 31 2007, 21:24:31)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print(len('Iñtërnâtiônàlizætiøn'))
20
>>>
:-)
Let’s try it out, based on this:
rubys@rubypad:~$ python3.0
Python 3.0a1 (py3k, Aug 31 2007, 21:24:31)
[GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> '𐌰𐍄𐍄𐌰 𐌿𐌽𐍃𐌰𐍂'
'\U00010330\U00010344\U00010344\U00010330 \U0001033f\U0001033d\U00010343\U00010330\U00010342'
>>> len('𐌰𐍄𐍄𐌰 𐌿𐌽𐍃𐌰𐍂')
19
>>>
Looks like I’ve compiled targeting UCS2.
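Where the 19 comes from can be sketched as follows. This is a minimal illustration, assuming a much later interpreter (Python 3.3 or newer), where len() counts code points regardless of build options; the helper name utf16_code_units is made up for this example. The string holds nine Gothic letters outside the BMP plus one space, and a UCS2 build stores each of those letters as two 16-bit code units.

```python
# Assumes Python 3.3+, where len() counts code points on every build.
# The text is the same Gothic string as in the session above.
text = ('\U00010330\U00010344\U00010344\U00010330 '
        '\U0001033f\U0001033d\U00010343\U00010330\U00010342')


def utf16_code_units(s):
    """Hypothetical helper: 16-bit code units needed to store s as UTF-16."""
    return len(s.encode('utf-16-le')) // 2


code_points = len(text)              # 9 letters + 1 space
code_units = utf16_code_units(text)  # 9 * 2 + 1

print(code_points, code_units)  # 10 19
```

The second number is what a UCS2 build reported for len(); the first is what a UCS4 build would have reported.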
Posted by Sam Ruby at
Okay, so it is the same situation as for Python 2.x. Things get really confusing when you index into a string and get back half a character ...
Posted by James Henstridge at
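The "half a character" effect can be reproduced on a modern interpreter by looking at the UTF-16 code units directly. This is only a sketch, assuming Python 3.3 or newer; the helper utf16_units is a name invented here, not something from the standard library. On a UCS2 build, s[0] for the first Gothic letter would have returned just the first of the two units shown.

```python
def utf16_units(ch):
    """Hypothetical helper: list the 16-bit UTF-16 code units of a string."""
    data = ch.encode('utf-16-be')
    return [int.from_bytes(data[i:i + 2], 'big')
            for i in range(0, len(data), 2)]


first_letter = '\U00010330'  # first character of the Gothic string
units = utf16_units(first_letter)
high, low = units

print([hex(u) for u in units])  # ['0xd800', '0xdf30']

# Each half is a lone surrogate, not a character in its own right.
high_is_surrogate = 0xD800 <= high <= 0xDBFF
low_is_surrogate = 0xDC00 <= low <= 0xDFFF
```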
5 Apr 2008
Py3k I18n
Improving on Sam Ruby’s example, to show that, in Python 3.0, code (names) can use Unicode characters (also, the default encoding of the interpreter is now UTF-8):
$ python3.0
Python 3.0a3+ (py3k:61959, Mar 26 2008, 21:02:26)
[GCC 4.2.3...
Excerpt from Advogato blog for eopadoan at
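The excerpt is cut off before its own example, so here is a small stand-in rather than the original code. It is a sketch assuming any Python 3 interpreter and a source file saved as UTF-8; the variable names are invented for illustration.

```python
# Non-ASCII letters are allowed in identifiers in Python 3.
# These names are illustrative only; they do not come from the excerpt.
größe = 3
naïve_count = größe + 1

# Non-ASCII string literals need no encoding declaration when the
# source file is UTF-8, which is the default in Python 3.
word = 'Iñtërnâtiônàlizætiøn'
word_length = len(word)

print(naïve_count, word_length)  # 4 20
```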
How does it fare with characters outside of the Basic Multilingual Plane? From memory, Python 2.x gives different answers depending on whether it was compiled in UCS2 or UCS4 mode.
[I guess I’ll find out for myself once I compile it ...]
Posted by James Henstridge at