Chapter 1: Building Abstractions with Functions by Harold Abelson, Gerald Jay Sussman, Julie Sussman - HTML preview

/ Home / Technology / Chapter 1: Building Abstractions with Functions

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub, Kindle for a complete version.

Interfaces. Message passing not only provides a method for coupling behavior and data, it allows different

data types to respond to the same message in different ways. A shared message that elicits similar behavior from

different object classes is a powerful method of abstraction.

As we have seen, an abstract data type is defined by constructors, selectors, and additional behavior conditions.

A closely related concept is an interface, which is a set of shared messages, along with a specification of what they

mean. Objects that respond to the special __repr__ and __str__ methods all implement a common interface

of types that can be represented as strings.

In the case of complex numbers, the interface needed to implement arithmetic consists of four messages:

real, imag, magnitude, and angle. We can implement addition and multiplication in terms of these mes-

sages.

We can have two different abstract data types for complex numbers that differ in their constructors.

• ComplexRI constructs a complex number from real and imaginary parts.

• ComplexMA constructs a complex number from a magnitude and angle.

With these messages and constructors, we can implement complex arithmetic.

>>> def add_complex(z1, z2):

return ComplexRI(z1.real + z2.real, z1.imag + z2.imag)

>>> def mul_complex(z1, z2):

return ComplexMA(z1.magnitude * z2.magnitude, z1.angle + z2.angle)

The relationship between the terms “abstract data type” (ADT) and “interface” is subtle. An ADT includes

ways of building complex data types, manipulating them as units, and selecting for their components. In an

object-oriented system, an ADT corresponds to a class, although we have seen that an object system is not needed

to implement an ADT. An interface is a set of messages that have associated meanings, and which may or may

not include selectors. Conceptually, an ADT describes a full representational abstraction of some kind of thing,

whereas an interface specifies a set of behaviors that may be shared across many things.

Properties. We would like to use both types of complex numbers interchangeably, but it would be wasteful to

store redundant information about each number. We would like to store either the real-imaginary representation

or the magnitude-angle representation.

Python has a simple feature for computing attributes on the fly from zero-argument functions. The @property

decorator allows functions to be called without the standard call expression syntax. An implementation of complex

numbers in terms of real and imaginary parts illustrates this point.

>>> from math import atan2

>>> class ComplexRI(object):

def __init__(self, real, imag):

self.real = real

self.imag = imag

@property

def magnitude(self):

return (self.real ** 2 + self.imag ** 2) ** 0.5

@property

def angle(self):

return atan2(self.imag, self.real)

def __repr__(self):

return ’ComplexRI({0}, {1})’.format(self.real, self.imag)

A second implementation using magnitude and angle provides the same interface because it responds to the

same set of messages.

>>> from math import sin, cos

>>> class ComplexMA(object):

def __init__(self, magnitude, angle):

self.magnitude = magnitude

self.angle = angle

@property

def real(self):

return self.magnitude * cos(self.angle)

@property

def imag(self):

return self.magnitude * sin(self.angle)

def __repr__(self):

return ’ComplexMA({0}, {1})’.format(self.magnitude, self.angle)

In fact, our implementations of add_complex and mul_complex are now complete; either class of com-

plex number can be used for either argument in either complex arithmetic function. It is worth noting that the

object system does not explicitly connect the two complex types in any way (e.g., through inheritance). We have

implemented the complex number abstraction by sharing a common set of messages, an interface, across the two

classes.

>>> from math import pi

>>> add_complex(ComplexRI(1, 2), ComplexMA(2, pi/2))

ComplexRI(1.0000000000000002, 4.0)

>>> mul_complex(ComplexRI(0, 1), ComplexRI(0, 1))

ComplexMA(1.0, 3.141592653589793)

The interface approach to encoding multiple representations has appealing properties. The class for each

representation can be developed separately; they must only agree on the names of the attributes they share. The

interface is also additive. If another programmer wanted to add a third representation of complex numbers to the

same program, they would only have to create another class with the same attributes.

Special methods. The built-in mathematical operators can be extended in much the same way as repr; there

are special method names corresponding to Python operators for arithmetic, logical, and sequence operations.

To make our code more legible, we would perhaps like to use the + and * operators directly when adding and

multiplying complex numbers. Adding the following methods to both of our complex number classes will enable

these operators to be used, as well as the add and mul functions in the operator module:

>>> ComplexRI.__add__ = lambda self, other: add_complex(self, other)

>>> ComplexMA.__add__ = lambda self, other: add_complex(self, other)

>>> ComplexRI.__mul__ = lambda self, other: mul_complex(self, other)

>>> ComplexMA.__mul__ = lambda self, other: mul_complex(self, other)

Now, we can use infix notation with our user-defined classes.

>>> ComplexRI(1, 2) + ComplexMA(2, 0)

ComplexRI(3.0, 2.0)

>>> ComplexRI(0, 1) * ComplexRI(0, 1)

ComplexMA(1.0, 3.141592653589793)

Further reading. To evaluate expressions that contain the + operator, Python checks for special methods on

both the left and right operands of the expression. First, Python checks for an __add__ method on the value of

the left operand, then checks for an __radd__ method on the value of the right operand. If either is found, that

method is invoked with the value of the other operand as its argument.

Similar protocols exist for evaluating expressions that contain any kind of operator in Python, including slice

notation and Boolean operators. The Python docs list the exhaustive set of method names for operators. Dive into Python 3 has a chapter on special method names that describes many details of their use in the Python interpreter.

2.7.3 Generic Functions

Our implementation of complex numbers has made two data types interchangeable as arguments to the add_complex

and mul_complex functions. Now we will see how to use this same idea not only to define operations that are

generic over different representations but also to define operations that are generic over different kinds of argu-

ments that do not share a common interface.

The operations we have defined so far treat the different data types as being completely independent. Thus,

there are separate packages for adding, say, two rational numbers, or two complex numbers. What we have not yet

considered is the fact that it is meaningful to define operations that cross the type boundaries, such as the addition

of a complex number to a rational number. We have gone to great pains to introduce barriers between parts of our

programs so that they can be developed and understood separately.

We would like to introduce the cross-type operations in some carefully controlled way, so that we can support

them without seriously violating our abstraction boundaries. There is a tension between the outcomes we desire:

we would like to be able to add a complex number to a rational number, and we would like to do so using a generic

add function that does the right thing with all numeric types. At the same time, we would like to separate the

concerns of complex numbers and rational numbers whenever possible, in order to maintain a modular program.

Let us revise our implementation of rational numbers to use Python’s built-in object system. As before, we

will store a rational number as a numerator and denominator in lowest terms.

>>> from fractions import gcd

>>> class Rational(object):

def __init__(self, numer, denom):

g = gcd(numer, denom)

self.numer = numer // g

self.denom = denom // g

def __repr__(self):

return ’Rational({0}, {1})’.format(self.numer, self.denom)

Adding and multiplying rational numbers in this new implementation is similar to before.

>>> def add_rational(x, y):

nx, dx = x.numer, x.denom

ny, dy = y.numer, y.denom

return Rational(nx * dy + ny * dx, dx * dy)

>>> def mul_rational(x, y):

return Rational(x.numer * y.numer, x.denom * y.denom)

Type dispatching. One way to handle cross-type operations is to design a different function for each possible

combination of types for which the operation is valid. For example, we could extend our complex number im-

plementation so that it provides a function for adding complex numbers to rational numbers. We can provide this

functionality generically using a technique called dispatching on type.

The idea of type dispatching is to write functions that first inspect the type of argument they have received,

and then execute code that is appropriate for the type. In Python, the type of an object can be inspected with the

built-in type function.

>>> def iscomplex(z):

return type(z) in (ComplexRI, ComplexMA)

>>> def isrational(z):

return type(z) == Rational

In this case, we are relying on the fact that each object knows its type, and we can look up that type us-

ing the Python type function. Even if the type function were not available, we could imagine implement-

ing iscomplex and isrational in terms of a shared class attribute for Rational, ComplexRI, and

ComplexMA.

Now consider the following implementation of add, which explicitly checks the type of both arguments. We

will not use Python’s special methods (i.e., __add__) in this example.

>>> def add_complex_and_rational(z, r):

return ComplexRI(z.real + r.numer/r.denom, z.imag)

>>> def add(z1, z2):

"""Add z1 and z2, which may be complex or rational."""

if iscomplex(z1) and iscomplex(z2):

return add_complex(z1, z2)

elif iscomplex(z1) and isrational(z2):

return add_complex_and_rational(z1, z2)

elif isrational(z1) and iscomplex(z2):

return add_complex_and_rational(z2, z1)

else:

return add_rational(z1, z2)

This simplistic approach to type dispatching, which uses a large conditional statement, is not additive. If

another numeric type were included in the program, we would have to re-implement add with new clauses.

We can create a more flexible implementation of add by implementing type dispatch through a dictionary.

The first step in extending the flexibility of add will be to create a tag set for our classes that abstracts away from

the two implementations of complex numbers.

>>> def type_tag(x):

return type_tag.tags[type(x)]

>>> type_tag.tags = {ComplexRI: ’com’, ComplexMA: ’com’, Rational: ’rat’}

Next, we use these type tags to index a dictionary that stores the different ways of adding numbers. The keys

of the dictionary are tuples of type tags, and the values are type-specific addition functions.

>>> def add(z1, z2):

types = (type_tag(z1), type_tag(z2))

return add.implementations[types](z1, z2)

This definition of add does not have any functionality itself; it relies entirely on a dictionary called add.implementations to implement addition. We can populate that dictionary as follows.

>>> add.implementations = {}

>>> add.implementations[(’com’, ’com’)] = add_complex

>>> add.implementations[(’com’, ’rat’)] = add_complex_and_rational

>>> add.implementations[(’rat’, ’com’)] = lambda x, y: add_complex_and_rational(y, x)

>>> add.implementations[(’rat’, ’rat’)] = add_rational

This dictionary-based approach to dispatching is additive, because add.implementations and type_tag.tags

can always be extended. Any new numeric type can “install” itself into the existing system by adding new entries

to these dictionaries.

While we have introduced some complexity to the system, we now have a generic, extensible add function

that handles mixed types.

>>> add(ComplexRI(1.5, 0), Rational(3, 2))

ComplexRI(3.0, 0)

>>> add(Rational(5, 3), Rational(1, 2))

Rational(13, 6)

Data-directed programming. Our dictionary-based implementation of add is not addition-specific at all; it

does not contain any direct addition logic. It only implements addition because we happen to have populated its

implementations dictionary with functions that perform addition.

A more general version of generic arithmetic would apply arbitrary operators to arbitrary types and use a dic-

tionary to store implementations of various combinations. This fully generic approach to implementing methods

is called data-directed programming. In our case, we can implement both generic addition and multiplication

without redundant logic.

>>> def apply(operator_name, x, y):

tags = (type_tag(x), type_tag(y))

key = (operator_name, tags)

return apply.implementations[key](x, y)

In this generic apply function, a key is constructed from the operator name (e.g., ’add’) and a tuple of type

tags for the arguments. Implementations are also populated using these tags. We enable support for multiplication

on complex and rational numbers below.

>>> def mul_complex_and_rational(z, r):

return ComplexMA(z.magnitude * r.numer / r.denom, z.angle)

>>> mul_rational_and_complex = lambda r, z: mul_complex_and_rational(z, r)

>>> apply.implementations = {(’mul’, (’com’, ’com’)): mul_complex,

(’mul’, (’com’, ’rat’)): mul_complex_and_rational,

(’mul’, (’rat’, ’com’)): mul_rational_and_complex,

(’mul’, (’rat’, ’rat’)): mul_rational}

We can also include the addition implementations from add to apply, using the dictionary update method.

>>> adders = add.implementations.items()

>>> apply.implementations.update({(’add’, tags):fn for (tags, fn) in adders})

Now that apply supports 8 different implementations in a single table, we can use it to manipulate rational and

complex numbers quite generically.

>>> apply(’add’, ComplexRI(1.5, 0), Rational(3, 2))

ComplexRI(3.0, 0)

>>> apply(’mul’, Rational(1, 2), ComplexMA(10, 1))

ComplexMA(5.0, 1)

This data-directed approach does manage the complexity of cross-type operators, but it is cumbersome. With

such a system, the cost of introducing a new type is not just writing methods for that type, but also the construction

and installation of the functions that implement the cross-type operations. This burden can easily require much

more code than is needed to define the operations on the type itself.

While the techniques of dispatching on type and data-directed programming do create additive implementa-

tions of generic functions, they do not effectively separate implementation concerns; implementors of the indi-

vidual numeric types need to take account of other types when writing cross-type operations. Combining rational

numbers and complex numbers isn’t strictly the domain of either type. Formulating coherent policies on the di-

vision of responsibility among types can be an overwhelming task in designing systems with many types and

cross-type operations.

Coercion. In the general situation of completely unrelated operations acting on completely unrelated types,

implementing explicit cross-type operations, cumbersome though it may be, is the best that one can hope for.

Fortunately, we can sometimes do better by taking advantage of additional structure that may be latent in our type

system. Often the different data types are not completely independent, and there may be ways by which objects of

one type may be viewed as being of another type. This process is called coercion. For example, if we are asked to

arithmetically combine a rational number with a complex number, we can view the rational number as a complex

number whose imaginary part is zero. By doing so, we transform the problem to that of combining two complex

numbers, which can be handled in the ordinary way by add_complex and mul_complex.

In general, we can implement this idea by designing coercion functions that transform an object of one type

into an equivalent object of another type. Here is a typical coercion function, which transforms a rational number

to a complex number with zero imaginary part:

>>> def rational_to_complex(x):

return ComplexRI(x.numer/x.denom, 0)

Now, we can define a dictionary of coercion functions. This dictionary could be extended as more numeric

types are introduced.

>>> coercions = {(’rat’, ’com’): rational_to_complex}

It is not generally possible to coerce an arbitrary data object of each type into all other types. For example,

there is no way to coerce an arbitrary complex number to a rational number, so there will be no such conversion

implementation in the coercions dictionary.

Using the coercions dictionary, we can write a function called coerce_apply, which attempts to coerce

arguments into values of the same type, and only then applies an operator. The implementations dictionary of

coerce_apply does not include any cross-type operator implementations.

>>> def coerce_apply(operator_name, x, y):

tx, ty = type_tag(x), type_tag(y)

if tx != ty:

if (tx, ty) in coercions:

tx, x = ty, coercions[(tx, ty)](x)

elif (ty, tx) in coercions:

ty, y = tx, coercions[(ty, tx)](y)

else:

return ’No coercion possible.’

key = (operator_name, tx)

return coerce_apply.implementations[key](x, y)

The implementations of coerce_apply require only one type tag, because they assume that both

values share the same type tag. Hence, we require only four implementations to support generic arithmetic over

complex and rational numbers.

>>> coerce_apply.implementations = {(’mul’, ’com’): mul_complex,

(’mul’, ’rat’): mul_rational,

(’add’, ’com’): add_complex,

(’add’, ’rat’): add_rational}

With these implementations in place, coerce_apply can replace apply.

>>> coerce_apply(’add’, ComplexRI(1.5, 0), Rational(3, 2))

ComplexRI(3.0, 0)

>>> coerce_apply(’mul’, Rational(1, 2), ComplexMA(10, 1))

ComplexMA(5.0, 1.0)

This coercion scheme has some advantages over the method of defining explicit cross-type operations. Al-

though we still need to write coercion functions to relate the types, we need to write only one function for each

pair of types rather than a different functions for each collection of types and each generic operation. What we

are counting on here is the fact that the appropriate transformation between types depends only on the types

themselves, not on the particular operation to be applied.

Further advantages come from extending coercion. Some more sophisticated coercion schemes do not just try

to coerce one type into another, but instead may try to coerce two different types each into a third common type.

Consider a rhombus and a rectangle: neither is a special case of the other, but both can be viewed as quadrilaterals.

Another extension to coercion is iterative coercion, in which one data type is coerced into another via intermediate

types. Consider that an integer can be converted into a real number by first converting it into a rational number,

then converting that rational number into a real number. Chaining coercion in this way can reduce the total number

of coercion functions that are required by a program.

Despite its advantages, coercion does have potential drawbacks. For one, coercion functions can lose infor-

mation when they are applied. In our example, rational numbers are exact representations, but become approxi-

mations when they are converted to complex numbers.

Some programming languages have automatic coercion systems built in. In fact, early versions of Python had a

__coerce__ special method on objects. In the end, the complexity of the built-in coercion system did not justify

its use, and so it was removed. Instead, particular operators apply coercion to their arguments as needed. Operators

are implemented as method calls on user defined types using special methods like __add__ and __mul__. It

is left up to you, the user, to decide whether to employ type dispatching, data-directed programming, message

passing, or coercion in order to implement generic functions in your programs.

computation based on that text. A programming language like Python is useful because we can define an inter-

preter, a program that carries out Python’s evaluation and execution procedures. It is no exaggeration to regard this

as the most fundamental idea in programming, that an interpreter, which determines the meaning of expressions

in a programming language, is just another program.

To appreciate this point is to change our images of ourselves as programmers. We come to see ourselves as

designers of languages, rather than only users of languages designed by others.

3.1.1 Programming Languages

In fact, we can regard many programs as interpreters for some language. For example, the constraint propagator

from the previous chapter has its own primitives and means of combination. The constraint language was quite

specialized: it provided a declarative method for describing a certain class of mathematical relations, not a fully

general language for describing computation. While we have been designing languages of a sort already, the

material of this chapter will greatly expand the range of languages we can interpret.

Programming languages vary widely in their syntactic structures, features, and domain of application. Among

general purpose programming languages, the constructs of function definition and function application are perva-

sive. On the other hand, powerful languages exist that do not include an object system, higher-order functions,

or even control constructs like while and for statements. To illustrate just how different languages can be, we

will introduce Logo as an example of a powerful and expressive programming language that includes very few advanced features.

In this chapter, we study the design of interpreters and the computational processes that they create when

executing programs. The prospect of designing an interpreter for a general programming language may seem

daunting. After all, interpreters are programs that can carry out any possible computation, depending on their

input. However, typical interpreters have an elegant common structure: two mutually recursive functions. The

first evaluates expressions in environments; the second applies functions to arguments.

These functions are recursive in that they are defined in terms of each other: applying a function requires

evaluating the expressions in its body, while evaluating an expression may involve applying one or more functions.

The next two sections of this chapter focus on recursive functions and data structures, which will prove essential

to understanding the design of an interpreter. The end of the chapter focuses on two new languages and the task

of implementing interpreters for them.

3.2 Functions and the Processes They Generate

A function is a pattern for the local evolution of a computational process. It specifies how each stage of the

process is built upon the previous stage. We would like to be able to make statements about the overall behavior

of a process whose local evolution has been specified by one or more functions. This analysis is very difficult to

do in general, but we can at least try to describe some typical patterns of process evolution.

In this section we will examine some common “shapes” for processes generated by simple functions. We will

also investigate the rates at which these processes consume the important computational resources of time and

space.

3.2.1 Recursive Functions

A function is called recursive if the body of that function calls the function itself, either directly or indirectly.

That is, the process of executing the body of a recursive function may in turn require applying that function again.

Recursive functions do not require any special syntax in Python, but they do require some care to define correctly.

As an introduction to recursive functions, we begin with the task of converting an English word into its Pig

Latin equivalent. Pig Latin is a secret language: one that applies a