Python vs Parrot
In many ways, it seems like Python and Parrot are from different planets.
In Python, the general approach seems to be to reduce everything possible to a canonical form as early as possible, and then deal with everything consistently.
In Parrot, the general approach seems to be to leave everything in its original form as long as possible, and then deal with everything separately.
Example
A simple example, incrementing a, in Python:
    LOAD_FAST a
    LOAD_CONST 1
    BINARY_ADD
    STORE_FAST a
And in Parrot:
    find_lex P1, a
    add P1, P1, 1
The first difference you may notice is that Python is stack based, and Parrot is register based; but the difference I want to focus on is the add operation itself. In Python, the BINARY_ADD operation is generic and can handle everything from integers to floating point to strings. For this to work, the numeric 1 must first be converted to an object (a process often referred to as boxing) and pushed on the stack. BINARY_ADD will then pull two objects off of the stack, unboxing as appropriate, do the appropriate operation, box up the result, and then push it back on the stack.
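The box, add, unbox cycle can be made concrete with a toy stack machine. This is only a sketch in Python 3: the Box class and the run function are hypothetical names invented for illustration, and the point at which the constant gets boxed is simplified compared to what a real interpreter does.

```python
class Box:
    """Hypothetical stand-in for a boxed object."""
    def __init__(self, value):
        self.value = value


def run(code, local_vars):
    """Execute a list of (opcode, argument) pairs on a toy stack machine."""
    stack = []
    for op, arg in code:
        if op == "LOAD_FAST":
            stack.append(local_vars[arg])      # locals already hold boxes
        elif op == "LOAD_CONST":
            stack.append(Box(arg))             # box the raw constant
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            # unbox both operands, add, and box up the result
            stack.append(Box(left.value + right.value))
        elif op == "STORE_FAST":
            local_vars[arg] = stack.pop()
        else:
            raise ValueError("unknown opcode: " + op)
    return local_vars


program = [
    ("LOAD_FAST", "a"),
    ("LOAD_CONST", 1),
    ("BINARY_ADD", None),
    ("STORE_FAST", "a"),
]
frame = run(program, {"a": Box(41)})
print(frame["a"].value)
```

The same BINARY_ADD branch also concatenates strings, which is the sense in which the operation is generic: it pays for boxing on every call in exchange for having only one code path.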
In Parrot, the boxing and unboxing is deferred... there is a separate and unique opcode for adding an integer to an object (a.k.a. PMC). This is in addition to opcodes that add a floating point number to an object, an integer to an integer, an object to an object, etc. This requires more special cases to be handled, but the payoff is that with this additional development work, runtime work can be eliminated. In this case, all of the boxing and unboxing can be avoided.
In absolute terms, boxing and unboxing is not very expensive. But in relative terms (and in this case, the comparison is against a simple integer addition), it can be very significant.
With the way Parrot is structured, much of the development overhead can be eliminated. Not every object class that wishes to provide an add operator needs to implement an add_int method. By using a common base class, a generic add_int method can be provided that boxes up the int and calls a single add method designed to work on objects. Such a technique allows subclasses for which add_int is common enough to be worth optimizing to implement it directly, without burdening all other subclasses with the need to do so.
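A minimal sketch of that base class arrangement, written in Python rather than as a Parrot PMC. The class names Obj, IntObj, and FloatObj are hypothetical and do not correspond to actual Parrot classes; only the add and add_int method names come from the discussion above.

```python
class Obj:
    """Hypothetical common base class."""

    def add(self, other):
        raise NotImplementedError

    def add_int(self, n):
        # Generic fallback: box up the raw int, then use the object-level add.
        return self.add(IntObj(n))


class IntObj(Obj):
    def __init__(self, value):
        self.value = value

    def add(self, other):
        return IntObj(self.value + other.value)

    def add_int(self, n):
        # Optimized override: no intermediate box for the raw int.
        return IntObj(self.value + n)


class FloatObj(Obj):
    def __init__(self, value):
        self.value = value

    def add(self, other):
        return FloatObj(self.value + other.value)
    # No add_int here: the inherited generic version is good enough.


print(IntObj(41).add_int(1).value)
print(FloatObj(1.5).add_int(2).value)
```

FloatObj gets a working add_int for free, while IntObj skips the boxing step on what is presumably its hottest path.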
Goals
The first goal of a Python on Parrot implementation needs to be fidelity to the CPython implementation. Otherwise, you are simply implementing a Python-like language on top of another runtime. Such a language would not be able to make use of the full range of existing Python libraries and scripts.
However, that goal, by itself, is insufficient. There already is a CPython implementation. Potential secondary goals include better performance and better integration with other languages. Both of those goals ultimately require some trade-offs to be made with respect to the first goal.
Most of the performance trade-offs can be made without compromise to functionality. Best cases are when common scenarios are made significantly faster at a marginal expense to less common scenarios.
The integration scenarios are trickier. Perl's integer divide has different semantics than Python's, particularly for negative numbers. What does dividing a Perl integer by a Python integer mean? If two Perl integers are passed to a Python function which attempts to divide them, what should be done?
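To see why negative numbers are the sticking point, compare Python's floor division with division that truncates toward zero. The sketch below assumes the other language truncates, as C-style integer division does; trunc_div is a hypothetical helper written for illustration, not anything taken from Perl or Parrot.

```python
def floor_div(a, b):
    """Python's own integer division: rounds toward negative infinity."""
    return a // b


def trunc_div(a, b):
    """Assumed C-style integer division: rounds toward zero."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


print(floor_div(7, 2), trunc_div(7, 2))
print(floor_div(-7, 2), trunc_div(-7, 2))
```

The two agree whenever the operands have the same sign and differ by one otherwise (whenever the division is inexact), so code that mixes integers from both languages has to pick one rule.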
The same operation that does binary addition also does string concatenation in Python.
These are admittedly edge cases. But such edge cases abound. Python has a dict as a fundamental data type. Perl has a hash. Keys of Python dictionaries can be any immutable value. Keys of Perl5 hashes can only be strings. More significant is the impact of Duck Typing. If somebody passes a Perl hash to a Python function, the Python function expects there to be a fair number of methods at its disposal. How much of this can be papered over, and how much of this will show through, is still a research topic at the moment.
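The key mismatch is easy to demonstrate from the Python side. The sketch below uses a hypothetical to_string_keyed function to imitate what would happen if a Python dict were squeezed into a container whose keys can only be strings; it is an illustration, not a description of how any actual bridge behaves.

```python
# A Python dict may mix keys of several immutable types.
py_dict = {1: "int key", "1": "str key", (1, 2): "tuple key"}


def to_string_keyed(mapping):
    """Hypothetical conversion to a string-keyed table."""
    result = {}
    for key, value in mapping.items():
        result[str(key)] = value   # distinct keys may collapse to one string
    return result


flattened = to_string_keyed(py_dict)
print(len(py_dict), len(flattened))
print(sorted(flattened))
```

The integer 1 and the string "1" are separate keys in Python but collide once stringified, so an entry is silently lost in the round trip.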
Fundamentals
To date, I've found a number of areas that are more fundamentally different between Python and Parrot than any of the examples above. The two implementations of Python on Parrot that I have looked at, namely pie-thon and pirate, approach these differently.
The first deals with the extent of the Python canonicalization mentioned above. In Parrot, instances may have properties, methods, and attributes. In Python, there are only attributes. This is possible because functions, methods, and even classes are also objects in Python, so each is a possible value for a given attribute.
In the pie-thon implementation of Python on Parrot, all methods are attributes. In Pirate, all methods are properties. The implication is that from the perspective of a language like Perl, such Python objects will have no methods defined.
This can be dealt with by implementing a find_method method in PyClass that searches first the set of methods, and then the attributes/properties.
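The lookup order can be sketched as follows. PyClassSketch is a hypothetical Python stand-in, not the actual PyClass source, and the rule that only callable attributes count as methods is my own assumption.

```python
class PyClassSketch:
    """Hypothetical model of a class with separate method and attribute tables."""

    def __init__(self, methods=None, attributes=None):
        self.methods = dict(methods or {})
        self.attributes = dict(attributes or {})

    def find_method(self, name):
        # 1. Natively registered methods take priority.
        if name in self.methods:
            return self.methods[name]
        # 2. Fall back to attributes, where Python-defined methods live.
        candidate = self.attributes.get(name)
        if callable(candidate):    # assumption: plain data is not a method
            return candidate
        return None


cls = PyClassSketch(
    methods={"m": lambda: "native m"},
    attributes={"m": lambda: "attribute m", "g": lambda: "attribute g", "x": 3},
)
print(cls.find_method("m")())
print(cls.find_method("g")())
print(cls.find_method("x"))
```

With a fallback like this, a caller from another language that asks for a method by name gets the Python function even though it was stored as an attribute.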
More troublesome is the issue of naming. In Parrot, the presumption is that all subroutines and classes are globally named. In Python, such names are lexically scoped. It is quite legal to have multiple methods in the same scope with the same name. In fact, the syntax to define a class in Python really only creates an anonymous class object and assigns it to a (lexically scoped) variable. The only names that are global in Python are module names. Modules in Python are used to address much the same types of problems that namespaces do in Parrot, but again, are fundamentally different.
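The "anonymous class object assigned to a variable" view can be checked directly. This sketch uses Python 3 syntax and the built-in type() constructor; the function name make is made up for the example.

```python
def make():
    # The class statement builds a class object and binds it to the local name c.
    class c:
        def m(self):
            return 0
    first = c

    # Roughly the same thing spelled out by hand: build a class object with
    # type() and rebind the same local name to it.
    c = type("c", (), {"m": lambda self: 1})
    second = c
    return first, second


first, second = make()
print(first.__name__, second.__name__)
print(first is second)
print(first().m(), second().m())
print("c" in globals())
```

Both class objects report the name c, yet they are distinct objects, and neither is reachable through any global name, which is exactly what a runtime that expects globally named classes has trouble with.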
Here's an example that can't be handled by pie-thon currently:
    def f(x):
        class c:
            def m(SELF): return 0
        if x<0:
            class c:
                def m(SELF): return -1
        if x>0:
            class c:
                def m(SELF): return +1
        return c()

    print f(7).m(), f(0).m(), f(-7).m()
But even that can be largely masked by clever compilers. Pirate addresses this with a bit of name mangling.
A difference that can't be masked at all is a difference that isn't there. In Python there is no vocabulary for "new-ing" up an instance of a class. Instead, the __call__ method on the class is expected to act like a factory. Non-Python libraries will either be required to mimic this behavior, or an alternate syntax (perhaps a Parrot module which exports a new function) will need to be provided.
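On the Python side, both halves of that idea are short. The sketch below shows that instantiation is just a call that goes through __call__, and what an exported new function might look like; the new helper and the Point class are hypothetical, written in Python 3.

```python
def new(cls, *args, **kwargs):
    """Hypothetical helper: create an instance by calling the class object."""
    return cls(*args, **kwargs)


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)                       # ordinary Python: call the class
q = new(Point, 3, 4)                  # the alternate spelling
r = type(Point).__call__(Point, 5, 6) # what the call syntax resolves to

print((p.x, p.y), (q.x, q.y), (r.x, r.y))
```

A helper like this only covers Python classes being created from elsewhere; going the other way still requires the foreign class to respond to being called.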
Status and Plans
Michal Wallace has given me commit access to Pirate, and I've made a number of small fixes. But mostly, I've been holding back until I can get a new set of Python-specific classes implemented and committed.
Leopold Tötsch has been committing (most of) my patches to Parrot, and now I am ready to have a largish one committed. It is mostly new Python-specific dynclass sources, with some small mods to the system to make it work. Once that is committed, I'll update Pirate, and both will once again pass all defined tests.
At that point, I plan to do two activities. One is to refactor as much of the existing logic in Pirate into Parrot dynclasses as possible. The other is to expand the test suite and use that to drive the addition of new functionality. Two sources of tests will be the parrotbench and the CPython unit test suite.