Programming Languages: Application and Interpretation by Shriram Krishnamurthi - HTML preview

PLEASE NOTE: This is an HTML preview only and some elements such as links or page numbers may be incorrect.
Download the book in PDF, ePub, Kindle for a complete version.

named-expr

env))]))

Figure 10.2: Recursion: Interpreter

Part V

Intermezzo

103

Chapter 11

Representation Choices

Having grown comfortable with environments, let’s try to get to the essence of what an environment is. This

will have an interesting (and perhaps surprising) implication for their representation, which in turn will lead

us to a deeper investigation of representations in general.

11.1

Representing Environments

We have seen one way of implementing the environment, which is as a list- or stack-like datatype:

(define-type Env

[mtSub]

[aSub (name symbol?)

(value FAE-Value?)

(env Env?)])

The environment, as we’ve noted, is a mapping from identifiers to values. But it’s a particular kind of

mapping: whenever we look up the value of an identifier, we want to get at most one value for it. That is,

the environment is just a (partial) function.

In a language like Scheme, we can implement a partial function directly using Scheme’s functions,

without having to go through a data structure representation. First, let’s define a predicate for our new

representation:

(define (Env? x)

(procedure? x))

This predicate is not exact, but it’ll suffice for our purposes. Using this representation, we have a different

way of implementing aSub (the contract stays the same):

(define (aSub bound-name bound-value env)

(lambda (want-name)

(cond

[(symbol=? want-name bound-name) bound-value]

[else (lookup want-name env)])))

105

106

CHAPTER 11. REPRESENTATION CHOICES

The function aSub must return an environment and, since we’ve chosen to represent environments by

Scheme functions (lambdas), aSub must return a function. This explains the lambda.

An environment is a function that consumes one argument, a want-name, the name we’re trying to look

up. It checks whether the name that name is the one bound by the current procedure. If it is, it returns the

bound value, otherwise it continues the lookup process. How does that work?

(define (lookup name env)

(env name))

A environment is just a procedure expecting an identifier’s name, so to look up a name, we simply apply it

to the name we’re looking for.

The implementation of the initial value, mtSub, is simply a function that consumes an identifier name

and halts in error:

(define (mtSub)

(lambda (name)

(error ’lookup ”no binding for identifier”)))

These changes are summarized in Figure 11.1. Given these changes, the core interpreter remains un-

changed from Figure 6.2.

11.2

Representing Numbers

Let’s consider our representation of numbers. We made the decision to represent FAE numbers as Scheme

numbers. Scheme numbers handle overflow automatically by growing as large as necessary. If we want

to have FAE numbers behave differently—for instance, by overflowing like Java’s numbers do—we would

need to use modular arithmetic that captures our desired overflow modes, instead of directly mapping the

operators in FAE to those in Scheme.

Because numbers are not as interesting as some of the other features we’ll be studying, we won’t be

conducting such an exercise. The relevant point is that when writing an interpreter, we get the power to make

these kinds of choices. A related choice, which is relevant to this text, is the representation of functions.

11.3

Representing Functions

What other representations are available for FAE functions (i.e., fun expressions)? Currently, our interpreter

uses a datatype. We might try to use strings or vectors; vectors would gain little over a datatype, and it’s

not quite clear how to use a string. One Scheme type that ought to be useful, however, is Scheme’s own

procedure mechanism, lambda. Let’s consider how that might work.

First, we need to change our representation of function values. We will continue to use a datatype, but

only to serve as a wrapper of the actual function representation (just like the numV clause only wraps the

actual number). That is,

(define-type FAE-Value

[numV (n number?)]

[closureV (p procedure?)])

11.4. TYPES OF INTERPRETERS

107

We will need to modify the fun clause of the interpreter. When we implemented environments with

procedures, we embedded a variant of the original lookup code in the redefined aSub. Here we do a similar

thing: we want FAE function application to be implemented with Scheme procedure application, so we

embed the original app code inside the Scheme procedure representing a FAE function.

[fun (bound-id bound-body)

(closureV (lambda (arg-val)

(interp bound-body

(aSub bound-id arg-val env))))]

That is, we construct a closureV that wraps a real Scheme closure. That closure takes a single value, which

is the value of the actual parameter. It then interprets the body in an extended environment that binds the

parameter to the argument’s value.

These changes should immediately provoke two important questions:

1. Which environment will the interpreter extend when evaluating the body? Because Scheme itself

obeys static scoping, the interpreter will automatically employ the environment active at procedure

creation. That is, Scheme’s lambda does the hard work so we can be sure to get the right cache.

2. Doesn’t the body get interpreted when we define the function? No, it doesn’t. It only gets evaluated

when something—hopefully the application clause of the interpreter—extracts the Scheme procedure

from the closureV value and applies it to the value of the actual parameter.

Correspondingly, application becomes

[app (fun-expr arg-expr)

(local ([define fun-val (interp fun-expr env)]

[define arg-val (interp arg-expr env)])

( (closureV-p fun-val)

arg-val))]

Having reduced the function and argument positions to values, the interpreter extracts the Scheme procedure

that represents the function (the boxed expression), and applies it to the argument value.

In short, a fun expression now evaluates to a Scheme procedure that takes a FAE value as its argument.

Function application in FAE is now just procedure application in Scheme. Figure 11.2 presents the entire

revised interpreter.

11.4

Types of Interpreters

We have seen a few different implementations of interpreters that are quite different in flavor. They suggest

the following taxonomy.

Definition 13 (syntactic interpreter) A syntactic interpreter is one that uses the interpreting language to

represent only terms of the interpreted language, implementing all the corresponding behavior explicitly.

108

CHAPTER 11. REPRESENTATION CHOICES

Definition 14 (meta interpreter) A meta interpreter is an interpreter that uses language features of the

interpreting language to directly implement behavior of the interpreted language.

While our substitution-based FAE interpreter was very nearly a syntactic interpreter, we haven’t written

any purely syntactic interpreters so far: even that interpreter directly relied on Scheme’s implementation of

numbers. The interpreter in Figure 11.2, which employs Scheme’s lambda (and its attendant static scope)

to represent fun, is distinctly a meta interpreter.

With a good match between the interpreted language and the interpreting language, writing a meta

interpreter can be very easy. With a bad match, though, it can be very hard. With a syntactic interpreter,

implementing each semantic feature will be somewhat hard,1 but in return you don’t have to worry as much

about how well the interpreting and interpreted languages correspond. In particular, if there is a particularly

strong mismatch between the interpreting and interpreted language, it may take less effort to write a syntactic

interpreter than a meta interpreter.

As an example, consider the implementation of laziness. Suppose we use Scheme closures as the repre-

sentation of functions, as in Figure 11.2. Function application in this language automatically becomes eager,

“inheriting” this behavior from Scheme’s eager evaluation semantics. If we instead wanted lazy evaluation,

we would have to expend some effort to “undo” the behavior inherited from Scheme and make application

lazy. Worse, we would be inheriting any subtleties that Scheme’s closure and application semantics might

posess.2 In contrast, the relatively syntactic interpreter given for laziness in Section 8 does not suffer from this peril.

Based on this discussion, we can now understand when it is and isn’t reasonable to exploit Scheme’s

representations, which may have seemed arbitrary until now. It is reasonable for features not under study

(such as numbers), but unreasonable for features directly under examination (such as function application,

when we’re studying functions—whether eager or lazy). Once we’ve provided a syntactic interpreter to

explain a feature, such as application and recursion, we can then exploit that feature in Scheme to build the

next level of complexity.

As an exercise, we can build upon our latest interpreter to remove the encapsulation of the interpreter’s

response in the FAE-Value type. The resulting interpreter is shown in Figure 11.3. This is a true meta

interpreter: it uses Scheme closures to implement FAE closures, Scheme procedure application for FAE

function application, Scheme numbers for FAE numbers, and Scheme arithmetic for FAE arithmetic. In fact,

ignoring some small syntactic differences between Scheme and FAE, this latest interpreter can be classified

as something more specific than a meta interpreter:

Definition 15 (meta-circular interpreter) A meta-circular interpreter is a meta interpreter in which the

interpreting and interpreted language are the same.

(Put differently, the trivial nature of the interpreter clues us in to the deep connection between the two

languages, whatever their syntactic differences may be.)

1Though a poor choice of meta language can make this much harder than necessary! We choose Scheme in part because it has

so many powerful features to draw upon in a meta interpreter.

2While Scheme has few dark corners, some versions of popular scripting languages have non-standard semantics for standard

constructs such as first-class functions. A meta interpreter that used these constructs directly would then inherit these dark corners,

probably inadvertently.

11.5. PROCEDURAL REPRESENTATION OF RECURSIVE ENVIRONMENTS

109

Meta-circular interpreters do very little to promote understanding. To gain an understanding of the

language being interpreted—one of the reasons we set out to write interpreters—we must already thoroughly

understand that language so we can write or understand the interpreter! Therefore, meta-circular interpreters

are elegant but not very effective educational tools (except, perhaps, when teaching one particular language).

In this text, therefore, we will write interpreters that balance between syntactic and meta elements, using

only those meta features that we have already understood well (such as Scheme closures). This is the only

meta-circular interpreter we will write.

That said, meta-circular interpreters do serve in one very important role: they’re good at finding weak-

nesses in language definitions! For instance, if you define a new scripting language, no doubt you will put

great effort into the design of its domain-specific features, such as those to parse data files or communicate

over a network. But will you get the domain-independent parts—procedures, scope, etc.—right also? And

how can you be sure? One good way is to try and write a meta, then meta-circular interpreter using the

language. You will probably soon discover all sorts of deficiencies in the core language. The failure to

apply this simple but effective experiment is partially responsible for the messy state of so many scripting

languages (Tcl, Perl, JavaScript, Python, etc.) for so long; only now are they getting powerful enough to

actually support effective meta-circular interpreters.

In short, by writing a meta-circular interpreter, you are likely to find problems, inconsistencies and, in

particular, weaknesses that you hadn’t considered before. In fact, some people would argue that a truly

powerful language is one that makes it easy to write its meta-circular interpreter.

Exercise 11.4.1 It is instructive to extend the Haskell interpreter given in Section 7.1 to implement recur-

sion. Use the data structure representation of the environment. In Section 10, this required mutation. Haskell does not provide a mutation operation. Without it, are you able to implement recursion?

11.5

Procedural Representation of Recursive Environments

Section 11.1 introduced a second, procedural, representation of environments. Section 10 discussed the implementation of recursion using a data structure representation of environments. We should therefore

consider whether we can implement recursion using the procedural representation.

Figure 10.2 presents a core interpreter that is relatively independent of the representation of environ-

ments. To enable recursion, we simply need to provide an implementation of the key helper function:

;; cyclically-bind-and-interp : symbol fun env → env

which must begin as follows:

(define (cyclically-bind-and-interp bound-name named-expr env)

· · ·)

We know that the following code pattern must exist because of the nature of the procedural representation

of environments:

(define (cyclically-bind-and-interp bound-name named-expr env)

...

(lambda (want-name)

110

CHAPTER 11. REPRESENTATION CHOICES

(cond

[(symbol=? want-name bound-name)

· · ·]

[else (lookup want-name env)]))

· · ·)

If the symbols match, what do we want to return? Looking up an identifier in an environment produces

values. Recall that the named expression must be a function, so its value must be a closure. Thus, the

response if the symbols match must yield a closure:

(define (cyclically-bind-and-interp bound-name named-expr env)

...

(lambda (want-name)

(cond

[(symbol=? want-name bound-name)

(closureV (fun-param named-expr)

(fun-body named-expr)

· · ·)]

[else (lookup want-name env)]))

· · ·)

What’s not yet clear is what environment to close over. It clearly can’t be just env; it must also contain this

additional binding. So how about we give a name to this new environment that knows about the binding for

bound-name?

(define (cyclically-bind-and-interp bound-name named-expr env)

(local ([define rec-ext-env

(lambda (want-name)

(cond

[(symbol=? want-name bound-name)

(closureV (fun-param named-expr)

(fun-body named-expr)

· · ·)]

[else (lookup want-name env)]))])

· · ·))

Having named it, it’s now easy to fill in the two ellipses. What environment do we want to close over

in the closure? One that binds the function named in bound-name to the appropriate closure. This is

the environment rec-ext-env. What do we want to return from this procedure? The recursively extended

environment. This is also rec-ext-env. Thus, ignoring the box momentarily,

(define (cyclically-bind-and-interp bound-name named-expr env)

(local ([define rec-ext-env

(lambda (want-name)

(cond

[(symbol=? want-name bound-name)

11.5. PROCEDURAL REPRESENTATION OF RECURSIVE ENVIRONMENTS

111

(closureV (fun-param named-expr)

(fun-body named-expr)

rec-ext-env )]

[else (lookup want-name env)]))])

rec-ext-env))

The relevant portions of the interpreter are in Figure 10.2 and Figure 11.4. Notice that all the code common to Figure 11.4 and Figure 11.1 is identical.

Exercise 11.5.1 One difference between Figure 11.1 and Figure 11.4 is the addition of recursive environment bindings. When we added support for recursive bindings in Section 10, we had to modify lookup. Why

didn’t lookup change between Figure 11.1 and Figure 11.4?

This definition raises two natural questions:

1. Is this really a recursive environment? Yes it is, though you’ll just have to take the word of the authors

of DrScheme that local does indeed define rec-ext-env as a recursive procedure, so references to that

name in the procedure’s body will indeed refer back to the same procedure.

2. Doesn’t the boxed reference to rec-ext-env have the same problem we were trying to avoid with ex-

pressions such as {rec {x x} x}? Actually, it doesn’t. The reference here is “under a lambda”,

that is, it is separated from the binding instance by a procedure declaration. Therefore, when the

named expression portion of the local is evaluated, it associates a closure with rec-ext-env that doesn’t

get invoked until much later—by which time the recursive environment of the local is safely defined.

This is the same issue we discussed in Section 9.3.

Reassuring as these responses may be, there is still something deeply unsatisfying about this solution.

We set out to add recursive functions to RCFAE. We reduced this to the problem of defining recursive

environments, which is legitimate (and, arguably, recursive environments are easier to think about than

recursive functions themselves). But we then implemented recursive environments by falling right back on

Scheme’s recursive functions: an abuse of meta-interpretive power, if ever there was any! For this reason,

the syntactic interpreter given in Section 10 is superior: it doesn’t rely on advanced knowledge of Scheme

(or, at least, no knowledge of features that we don’t also find in more mainstream programming languages).

As an aside, this discussion highlights both a power and peril of meta-interpretive choices. The power

of choosing the procedural representation is that we can add recursion to the language very easily. If our

goal is to add it as quickly as possible, while minimizing error, it makes sense to exploit the effort put into

implementing recursion for Scheme. But the peril is that this implementation does not hold descriptive

power: it still begs the question of how to implement recursion from scratch.

Exercise 11.5.2 Is it possible to implement recursive environments using the procedural representation

without employing Scheme’s constructs for creating recursive procedures? That is, can FAE alone express

recursive functions?

Exercise 11.5.3 The two implementations differ slightly in the way they treat illegal named expressions (i.e.,

ones that are not syntactic procedures). Do you see why? How would you make them behave identically?

112

CHAPTER 11. REPRESENTATION CHOICES

(define (Env? x)

(procedure? x))

;; mtSub : () → Env

(define (mtSub)

(lambda (name)

(error ’lookup ”no binding for identifier”)))

;; aSub: symbol FAE-Value Env → Env

(define (aSub bound-name bound-value env)

(lambda (want-name)

(cond

[(symbol=? want-name bound-name) bound-value]

[else (lookup want-name env)])))

;; lookup : symbol Env → FAE-Value

(define (lookup name env)

(env name))

Figure 11.1: Procedural Representation of Environments

(define-type FAE-Value

[numV (n number?)]

[closureV (p procedure?)])

;; interp : FAE Env → FAE-Value

(define (interp expr env)

(type-case FAE expr

[num (n) (numV n)]

[add (l r) (num+ (interp l env) (interp r env))]

[id (v) (lookup v env)]

[fun (bound-id bound-body)

(closureV (lambda (arg-val)

(interp bound-body

(aSub bound-id arg-val env))))]

[app (fun-expr arg-expr)

(local ([define fun-val (interp fun-expr env)]

[define arg-val (interp arg-expr env)])

((closureV-p fun-val)

arg-val))]))

Figure 11.2: Procedural Representation of Functions

11.5. PROCEDURAL REPRESENTATION OF RECURSIVE ENVIRONMENTS

113

(define (number-or-procedure? v)

(or (number? v)

(procedure? v)))

(define-type Env

[mtSub]

[aSub (name symbol?) (value number-or-procedure?) (env Env?)])

;; lookup : symbol Env → number-or-procedure

(define (lookup name env)

(type-case Env env

[mtSub () (error ’lookup ”no binding for identifier”)]

[aSub (bound-name bound-value rest-env)

(if (symbol=? bound-name name)

bound-value

(lookup name rest-env))]))

;; interp : FAE Env → number-or-procedure

(define (interp expr env)

(type-case FAE expr

[num (n) n]

[add (l r) (+ (interp l env) (interp r env))]

[id (v) (lookup v env)]

[fun (bound-id bound-body)

(lambda (arg-val)

(interp bound-body

(aSub bound-id arg-val env)))]

[app (fun-expr arg-expr)

(local ([define fun-val (interp fun-expr env)]

[define arg-val (interp arg-expr env)])

(fun-val arg-val))]))

Figure 11.3: Meta-Circular Interpreter

114

CHAPTER 11. REPRESENTATION CHOICES

(define-type RCFAE-Value

[numV (n number?)]

[closureV (param symbol?)

(body RCFAE?)

(env Env?)])

(define (Env? x)

(procedure? x))

(define (mtSub)

(lambda (name)

(error ’lookup ”no binding for identifier”)))

(define (aSub bound-name bound-value env)

(lambda (want-name)

(cond

[(symbol=? want-name bound-name) bound-value]

[else (lookup want-name env)])))

(define (cyclically-bind-and-interp bound-name named-expr env)

(local ([define rec-ext-env

(lambda (want-name)

(cond

[(symbol=? want-name bound-name)

(closureV (fun-param named-expr)

(fun-body named-expr)

rec-ext-env)]

[else (lookup want-name env)]))])

rec-ext-env))

(define (lookup name env)

(env name))

Figure 11.4: Recursion: Support Code with Procedural Representation of Environments

Part VI

State

115

Chapter 12

Church and State

In Section 10 we freely employed Scheme boxes, but we haven’t yet given an account of how they work

(beyond an informal intuition). We therefore begin an investigation of boxes and other forms of mutation—

the changing of values associated with names—to endow a language with state.

Mutation is a standard feature in most programming languages. The programs we have written in

Scheme have, however, been largely devoid of state. Indeed, Haskell has no mutation operations at all.

It is, therefore, possible to design and use languages—even quite powerful ones—that have no explicit no-

tion of state. Simply because the idea that one can program without state hasn’t caught on in the mainstream

is no reason to reject it.

That said, state does have its place in computation. If we create programs to model the real world, then

some of those programs are going to have to accommodate the fact that there the real world has events that

truly alter it. For instance, cars really do consume fuel as they run, so a program that models a fuel tank

might best use state to record changes in fuel level.

Despite that, it makes sense to eschew state where possible because state makes it harder to reason

about programs. Once a language has mutable entities, it becomes necessary to talk about the program

before each mutation happens and similarly after that mutation (i.e., the different “states” of the program).