Thus far we have approached Scheme as if it were a spoken language. Like toddlers, we learned the vocabulary of the language, we acquired an intuitive understanding of its meaning, and we figured out some basic rules of how to compose and not to compose sentences. Truly effective communication, however, in any language -- be it natural like English or artificial like Scheme -- eventually requires a formal study of its vocabulary, its grammar, and the meaning of sentences.
A programming language is in many ways like a spoken language. It has a vocabulary and a grammar. The vocabulary is the collection of those ``basic words'' from which we can compose ``sentences'' in our language. A sentence in a programming language is an expression or a function; the language's grammar dictates how to form complete sentences from words. Programmers use the terminology SYNTAX to refer to the vocabularies and grammars of programming languages.
Not all grammatical sentences are meaningful -- neither in English nor in a programming language. For example, the English sentence ``the cat is round'' is a meaningful sentence, but ``the brick is a car'' makes no sense, even though it is completely grammatical. To determine whether or not a sentence is meaningful, we must study the MEANING, or SEMANTICS, of words and sentences. For spoken languages, we typically explain the meaning of words with sentences that use simpler words; in the case of a foreign language, we sometimes explain a word with simple sentences in the foreign language or we translate words to a known language. For programming languages, there are also several ways to explain the meaning of individual sentences. In this book, we discuss the meaning of Scheme programs through an extension of the familiar laws of arithmetic and algebra. After all, computation starts with this form of simple mathematics, and we should understand the connection between this mathematics and computing.
The first three sections present the vocabulary, grammar, and meaning of a
small, but powerful subset of Scheme. The fourth one resumes our discussion
of run-time errors in Scheme, based on our new understanding of its
meaning. The remaining three sections revisit and
and or
expressions, variable definitions, and structures.
Scheme's basic vocabulary consists of five categories of words. The five lines in figure 20 show how computer scientists discuss the vocabulary of a language.27 All lines employ the same notation. They enumerate some simple examples separated by a bar (`` | ''). Dots indicate that there are more things of the same kind in some category.
|
The first category is that of variables, which are the names of functions and values. The second introduces constants: boolean, symbolic, and numeric constants. As indicated before, Scheme has a powerful number system, which is best introduced gradually by examples. The final category is that of primitive operations, which are those basic functions that Scheme provides from the very beginning. While it is possible to specify this collection in its entirety, it is best introduced in pieces, as the need arises.
For the classification of Scheme sentences, we also need three
keywords
: define
, cond
, and else
.
These keywords have no meaning. Their role resembles that of punctuation
marks, especially that of commas and semicolons, in English writing; they
help programmers distinguish one sentence from another. No keyword may be
used as a variable.
Grammar, Layout, and Editing |
In contrast to many other programming languages, Scheme has a simple grammar. It is shown in its entirety in figure 21.28 The grammar defines two categories of sentences: Scheme definitions, <def>, and expressions, <exp>. While the grammar does not dictate the use of white space between the items of sentences, we follow the convention to put at least one blank space behind each item unless an item is followed by a right parenthesis ``)''. Scheme is flexible concerning blank space, and we can replace a single blank space by many spaces, line breaks, and page breaks.
The two grammar definitions describe how to form atomic sentences and
compound sentences, which are sentences built from other sentences. For
example, a function definition
is formed by using ``('', followed by the
keyword define
, followed by another ``('', followed by a non-empty
sequence of variables, followed by ``)'', followed by an expression, and
closed by a right parenthesis ``)'' that matches the very first one. The
keyword define
distinguishes definitions from expressions.
The category of expressions consists of six alternatives: variables,
constants, primitive applications,
(function) applications,
and two varieties of cond
itionals. The last four are again composed of
other expressions. The keyword cond
distinguishes conditional
expressions from primitive and function applications.
Here are three examples of expressions: 'all
, x
, and
(x x)
. The first one belongs to the class of symbols and is
therefore an expression. The second is a variable, and every variable is an
expression. Finally, the third is a function application, because x
is a variable.
In contrast, the following parenthesized sentences are not expressions:
(f define)
, (cond x)
, and ()
. The first one
partially matches the shape of a function application but it uses
define
as if it were a variable. The second one fails to be a
correct cond-expression because it contains a variable as the
second item and not a pair of expressions surrounded by parentheses. The
last one is just a pair of parentheses, but the grammar requires that every
left parenthesis is followed by something other than a right parenthesis.
Exercise 8.2.1.
Why are the sentences
1. x | 2. (= y z) | 3. (= (= y z) 0)
|
Explain why the following sentences are illegal expressions:
1. (3 + 4) | 2. empty?(l) | 3. (x)
Solution |
Exercise 8.2.2.
Why are the sentences
1. (define (f x) x) | 2. (define (f x) y) | 3. (define (f x y) 3)
|
Explain why the following sentences are illegal definitions:
1. (define (f 'x) x) | 2. (define (f x y z) (x)) | 3. (define (f) 10)
Solution |
Exercise 8.2.3.
Distinguish syntactically legal from illegal sentences:
1. (x) | 2. (+ 1 (not x)) | 3. (+ 1 2 3)
|
Exercise 8.2.4.
Distinguish syntactically legal from illegal sentences:
1. (define (f x) 'x) | 2. (define (f 'x) x) | 3. (define (f x y) (+ 'y (not x)))
|
Grammatical Terminology: The components of compound sentences have names. We have introduced some of these names on an informal basis; for better communication, we introduce all useful ones here. The second component of a definition, that is, the non-empty sequence of variables, is a HEADER. Accordingly, the expression component of a definition is called BODY. The variables that follow the first variable in a header are the PARAMETERS of a function.
|
People who think of definition as the definition of a mathematical function also use the terminology LEFT-HAND SIDE for a definition's header and RIGHT-HAND SIDE for the body. For the same reason, the first component in an application is called FUNCTION and the remaining components are referred to as ARGUMENTS. Occasionally, we also use ACTUAL ARGUMENTS.
Finally, a cond-expression
consists of cond
-lines or
cond
-clauses. Each line consists of two expressions: the
QUESTION
and the ANSWER.
A question is also called a CONDITION.
Figure 22 provides a summary of the conventions.
Stepper |
A legal DrScheme program consists of two items: a sequence of function definitions (in the Definitions window) and a sequence of interactions (in the Interactions window). Each interaction is a demand for the evaluation of one Scheme expression, which typically refers to the functions defined in the upper part of DrScheme.
When DrScheme evaluates an expression, it uses nothing but the laws of arithmetic and algebra to convert an expression into a value. In ordinary mathematics courses, values are just numbers. We also include symbols, booleans, and indeed all constants:
|
Now that we have defined the set of values, it is easy to introduce and to explain the evaluation rules. The rules come in two categories: those that appeal to arithmetic knowledge and those that rely on a small amount of algebra. First, we need an infinite number of rules like those of arithmetic to evaluate applications of primitives:
(+ 1 1) = 2 (- 2 1) = 1
But Scheme ``arithmetic'' is more general than just number crunching. It also includes rules for dealing with boolean values, symbols, and lists like these:
(not true) = false (symbol=? 'a 'b) = false (symbol=? 'a 'a) = true
Second, we need one rule from algebra to understand how the application of a user-defined function advances computation. Suppose the Definitions window contains the definition
(define (f x-1 ... x-n) exp)
and f, x-1, ..., x-n
are variables and exp
is some
(legal) expression. Then an application of a function is governed by the
law:
|
v-1 ... v-n
is a sequence of values that is as long as
x-1 ... x-n
.This rule is as general as possible, so it is best to look at a concrete example. Say the definition is
(define (poly x y) (+ (expt 2 x) y))
Then the application (poly 3 5)
can be evaluated as follows:
(poly 3 5) = (+ (expt 2 3) 5)) ;; This line is(+ (expt 2 x) y)
wherex
was replaced by3
andy
by5
. = (+ 8 5) = 13
These last two steps follow from plain arithmetic.
Third and finally, we need some rules that help us determine the value of cond-expressions. These rules are algebraic rules but are not a part of the standard algebra curriculum:
false
:
(cond [false ...] [exp1 exp2] ...)
= (cond ; The first line disappeared. [exp1 exp2] ...)
then the first cond
-line disappears;
true
:
(cond [true exp] ...)
= exp
the entire cond-expressions is replaced by the first answer;
else
-line:
(cond [else exp])
= exp
the cond-expressions is replaced by the answer in the
else
-clause.
No other rules are needed to understand cond
.
Consider the following evaluation:
(cond [false 1] [true (+ 1 1)] [else 3]) = (cond [true (+ 1 1)] [else 3]) = (+ 1 1) = 2
It first eliminates a cond
-line and then equates the
cond-expression with (+ 1 1)
. The rest is plain
arithmetic again.
The rules are equations of the form that we use in arithmetic and algebra
on a daily basis.
Indeed, the same laws apply to this system of equations
as to those in mathematics. For example, if a = b
and b = c
, then we also know that a = c
. A consequence is
that as we get better at hand-evaluations, we can skip obvious steps and
combine several equational inferences into one. Here is one shorter version
of the previous evaluation:
(cond [false 1] [true (+ 1 1)] [else 3]) = (+ 1 1) = 2
Even more importantly, we can replace any expression by its equal in every context -- just as in algebra. Here is a another cond-expression and its evaluation:
(cond [(= 1 0)
0] [else (+ 1 1)]) ;; The underlined expression is evaluated first. = (cond [false 0] [else (+ 1 1)]) ;; Herecond
_false
applies. = (cond [else (+ 1 1)]) ;; Usingcond
_else
, we now get an arithmetic expression. = (+ 1 1) = 2
For the first step, we evaluated the nested, underlined expression,
which is clearly essential here, because no cond
rule would apply
otherwise. Of course, there is nothing unusual about this kind of
computing. We have done this many times in algebra and in the first few
sections of this book.
Exercise 8.3.1. Evaluate the following expressions step by step:
1. (+ (* (/ 12 8) 2/3) (- 20 (sqrt 4))) 2. (cond [(= 0 0) false] [(> 0 1) (symbol=? 'a 'a)] [else (= (/ 1 0) 9)]) 3. (cond [(= 2 0) false] [(> 2 1) (symbol=? 'a 'a)] [else (= (/ 1 2) 9)]) ; Solution
Exercise 8.3.2. Suppose the Definitions window contains
;; f : number number -> number
(define (f x y)
(+ (* 3 x) (* y y)))
Show how DrScheme evaluates the following expressions, step by step:
1. (+ (f 1 2) (f 2 1)) 2. (f 1 (* 2 3)) 3. (f (f 1 (* 2 3)) 19) ; Solution
Parenthesized sentences may or may not belong to Scheme, depending on whether or not they are legal according to the grammar in figure 21. If DrScheme verifies that a sentence does not belong to the language dubbed Beginning Student, it signals a SYNTAX ERROR.
The remaining expressions are syntactically legal, but some of those may
still pose problems for our evaluation rules. We say that such legal
expressions contain LOGICAL ERRORS
or RUN-TIME ERRORS.
Consider the simplest example: (/ 1 0)
. We already know from
mathematics that
does not have a value. Clearly, since Scheme's
calculations must be consistent with mathematics, it too must not equate
(/ 1 0)
with a value.
In general, if an expression is not a value and if the evaluation rules
allow no further simplification, we say that an error occurred or that the
function raises an error signal. Pragmatically this means that the
evaluation stops immediately with an appropriate error message, such as
"/: divide by zero"
for division by zero.
For an example, consider the following evaluation:
(+ (* 20 2) (/ 1 (- 10 10))) = (+ 40 (/ 1 0)) = /: divide by zero
The error eliminates the context (+ 40 ...)
around (/ 1 0)
,
which represents the remainder of the computation with respect to the division.
To understand how run-time errors are signaled, we must inspect the evaluation rules again. Consider the function
;; my-divide : number -> number
(define (my-divide n)
(cond
[(= n 0) 'inf]
[else (/ 1 n)]))
Now suppose we apply my-divide
to 0
. Then the first step is:
(my-divide 0)
= (cond
[(= 0 0) 'inf]
[else (/ 1 0)
])
It would obviously be wrong to say that the function signals the error ``/:
divide by zero'' now, even though an evaluation of the underlined
subexpression would demand it. After all, (= 0 0)
is
true
and therefore the application has a proper result:
(my-divide 0) = (cond [(= 0 0) 'inf] [else(/ 1 0)
]) = (cond [true 'inf] [else(/ 1 0)
]) = 'inf
Fortunately, our laws of evaluation take care of these situations automatically. We just need to keep in mind when the laws apply. For example, in
(+ (* 20 2) (/ 20 2))
the addition cannot take place before the multiplication or division. Similarly, the underlined division in
(cond
[(= 0 0) 'inf]
[else (/ 1 0)
])
cannot be evaluated until the corresponding line is the first condition in the cond-expression.
As a rule of thumb, it is best to keep the following in mind:
Guideline on Expression Evaluation |
Simplify the outermost (and left-most) subexpression that is ready for evaluation. |
While this guideline is a simplification, it always explains Scheme's results.
In some cases, programmers also want to define functions that raise errors.
Recall the checked version of area-of-disk
from
section 6:
;;checked-area-of-disk : Scheme-value -> boolean
;; to compute the area of a disk with radiusv
, ifv
is a number (define (checked-area-of-disk v) (cond [(number? v) (area-of-disk v)] [else (error 'checked-area-of-disk "number expected")]))
If we were to apply checked-area-of-disk
to a symbol, we would get
the following evaluation:
(- (checked-area-of-disk 'a) (checked-area-of-disk 10)) = (- (cond [(number? 'a) (area-of-disk 'a)] [else (error 'checked-area-of-disk "number expected")]) (checked-area-of-disk 10)) = (- (cond [false (area-of-disk 'a)] [else (error 'checked-area-of-disk "number expected")]) (checked-area-of-disk 10)) = (- (error 'checked-area-of-disk "number expected") (checked-area-of-disk 10)) = checked-area-of-disk : number expected
In other words, when we evaluate the error
expression, we
proceed as if we had encountered a division by zero.
Our current definition of the Beginning Student Scheme language omits two
forms of expressions: and
and or
expressions.
Adding them
provides a case study of how to study new language construct. We must
first understand their syntax, then their semantics, and finally their
pragmatics.
Here is the revised grammar:
|
and
and or
are keywords, each
followed by two expressions. At first glance, the two look like
(primitive or function) applications. To understand why they are not, we
must look at the pragmatics of these expressions first.
Suppose we need to formulate a condition that determines whether the
n
-th fraction of 1
is m
:
(and (not (= n 0)) (= (/ 1 n) m))
We formulate the condition as an and
combination of two boolean
expressions, because we don't wish to divide by 0
accidentally.
Next, assume n
becomes 0
during the course of the
evaluation. Then the expression becomes
(and (not (= 0 0)) (= (/ 1 0) m))
Now, if and
were an ordinary operation, we would have to evaluate
both subexpressions, which would trigger an error in the second one. For
this reason, and
is not a primitive operation,
but a special
expression. In short, we use and
and or
expressions to
combine boolean computations that may have to short-cut an evaluation.
Once we understand how and
and or
expressions should be
evaluated, it is easy to formulate matching rules. Better still, we can
formulate expressions in our first language that are equivalent to these
expressions:
(and <exp-1> <exp-2>) = (cond [<exp-1> <exp-2>] [else false])
and
(or <exp-1> <exp-2>) = (cond [<exp-1> true] [else <exp-2>])
These equivalences simplify what actually takes place in DrScheme but they are a perfectly appropriate model for now.
Programs consist not only of function definitions but also variable definitions, but these weren't included in our first grammar.
Here is the grammar rule for variable definitions:
|
define
, followed by a variable, followed by an expression, and
closed by a right parenthesis ``)'' that matches the very first one. The
keyword define
distinguishes variable definitions from
expressions, but not from function definitions. For that, a reader must look
at the second component of the definition.Next we must understand what a variable definition means. A variable definition like
(define RADIUS 5)
has a plain meaning. It says that wherever we encounter RADIUS
during an evaluation, we may replace it with 5
.
When DrScheme encounters a definition with a proper expression on the right-hand side, it must evaluate that expression immediately. For example, the right-hand side of the definition
(define DIAMETER (* 2 RADIUS))
is the expression (* 2 RADIUS)
. Its value is 10
because
RADIUS
stands for 5
. Hence we can act as if we had written
(define DIAMETER 10)
In short, when DrScheme encounters a variable definition, it determines the
value of the right-hand side. For that step, it uses all those definitions
that precede the current definition but not those that follow. Once
DrScheme has a value for the right-hand side, it remembers that the name on
the left-hand side stands for this value. Whenever we evaluate an
expression, every occurrence of the define
d variable is replaced
by its value.
Consider the sequence of definitions in figure 23.
As DrScheme steps through this sequence of definitions, it first determines
that RADIUS
stands for 10
, DIAMETER
for
20
, and area is the name of a function. Finally, it evaluates
(area RADIUS)
to 314.0
and associates
AREA-OF-RADIUS
with that value.
Exercise 8.6.1. Make up five examples of variable definitions. Use constants and expressions on the right-hand side. Solution
Exercise 8.6.2. Evaluate the following sequence of definitions
(define RADIUS 10) (define DIAMETER (* 2 RADIUS)) (define CIRCUMFERENCE (* 3.14 DIAMETER))
by hand. Solution
Exercise 8.6.3. Evaluate the following sequence of definitions
(define PRICE 5) (define SALES-TAX (* .08 PRICE)) (define TOTAL (+ PRICE SALES-TAX))
by hand. Solution
We still have to understand the syntax and semantics of one more Scheme
construct: define-struct
. When we define a structure, we really
define several primitive operations: a constructor,
several selectors, and
a predicate.
Hence, define-struct
is by far the most complex
Scheme construct we use.
A structure definition is a third form of
definition. The keyword define-struct
distinguishes this form of
definition from function and variable definitions. The keyword is followed
by a name and a sequence of names enclosed in parentheses:
<def> =
(define-struct <var0> (<var-1> ... <var-n>)) .
|
define-struct
definition must be chosen as if they
were function names, though none of them is used as a
function (or variable) name.
(define-struct point (x y z)) .
Since point
, x
, y
, and z
are variables and
the parentheses are placed according to the grammatical pattern, it is a
proper definition of a structure. In contrast, these two parenthesized sentences
(define-struct (point x y z)) (define-struct point x y z)
are improper definitions, because define-struct
is not followed
by a single variable name and a sequence of variables in parentheses.
A define-struct
definition introduces new primitive
operations.
The names of these operations are formed from those that occur
in the definition. Suppose a data structure definition has the following
shape:
(define-struct c (s-1 ... s-n))
Then Scheme introduces the following primitive operations:
These primitives have the same status as +
, -
, or
*
. Before we can understand the rules that govern these new
primitives, however, we must return to the definition of values. After
all, the purpose of define-struct
is to introduce a new class of
values: structures.
Simply put, the set of values no longer consists of just constants, but
also of structures, which compound several values into one. In terms of
our grammar of values, we must add one clause per define-struct
:
|
points
structures. Since the list of fields
contains three names, (make-point u v w)
is a value if
u
, v
, and w
are values.
Now we are in a position to understand the evaluation rules of the new
primitives. If c-s-1
is applied to a c
structure, it
returns the first component of the value. Similarly, the second selector
extracts the second component, the third selector the third component, and
so on. The relationship between the new data constructor and the selectors
is best characterized with n equations:
(c-s-1 (make-c V-1 ... V-n)) = V-1 (c-s-n (make-c V-1 ... V-n)) = V-n
where V-1 ... V-n
is a sequence of values that is as long
as s-1 ... s-n
.
For our running example, we get the equations
(point-x (make-point V U W)) = V (point-y (make-point V U W)) = U (point-z (make-point V U W)) = W
In particular, (point-y (make-point 3 4 5))
is equal to
4
, and (point-x (make-point (make-point 1 2 3) 4 5))
evaluates to (make-point 1 2 3)
because the latter is also a value.
The predicate c?
can be applied to any value. It returns
true
if the value is of kind c
and false
otherwise.
We can translate both parts into equations. The first one,
(c? (make-c V-1 ... V-n)) = true ,
relates c?
and values constructed with make-c
; the second one,
(c? V) = false; ifV
is a value not constructed withmake-c
,
relates c?
to all other values.
Again, the equations are best understood in terms of our example. Here are the general equations:
(point? (make-point V U W)) = true (point? U) = false ; ifU
is value, but not apoint
structure.
Thus, (point? (make-point 3 4 5))
is true
and
(point? 3)
is false
.
Exercise 8.7.1. Distinguish legal from illegal sentences:
(define-struct personnel-record (name salary dob ssn))
(define-struct oops ())
(define-struct child (dob date (- date dob)))
(define-struct (child person) (dob date))
(define-struct child (parents dob date))
Explain why the sentences are legal define-struct
definitions or how they fail to be legal define-struct
definitions.
Solution
Exercise 8.7.2. Which of the following are values?
(make-point 1 2 3)
(make-point (make-point 1 2 3) 4 5)
(make-point (+ 1 2) 3 4)
Solution
Exercise 8.7.3. Suppose the Definitions window contains
(define-struct ball (x y speed-x speed-y))
Determine the results of the following expressions:
(number? (make-ball 1 2 3 4))
(ball-speed-y (make-ball (+ 1 2) (+ 3 3) 2 3))
(ball-y (make-ball (+ 1 2) (+ 3 3) 2 3))
Also check how DrScheme deals with the following expressions:
(number? (make-ball 1 3 4))
(ball-x (make-posn 1 2))
(ball-speed-y 5)
Verify your solutions with DrScheme. Solution
|
27 We use different fonts to distinguish the words of different categories. Constants and primitive operations are type set in sans serif, variables in italics, and keywords in boldface.
28 This grammar describes only that portion of Scheme we have used so far (minus variable and structure definitions), which still covers a large subset of the full language. Scheme is a bit larger, and we will get to know more of it in the remaining parts of the book.