The functions we have developed so far fall into two broad categories. On one hand, we have the category of functions that encapsulate domain knowledge. On the other hand, we have functions that consume structured data. These functions typically decompose their arguments into their immediate structural components and then process those components. If one of the immediate components belongs to the same class of data as the input, the function is recursive. For that reason, we refer to these functions as (STRUCTURALLY) RECURSIVE FUNCTIONS. In some cases, however, we also need functions based on a different form of recursion, namely, generative recursion. The study of this form of recursion is as old as mathematics and is often called the study of ALGORITHMS.
The inputs of an algorithm represent a problem. Except for rare occasions, the problem is an instance of a large class of problems and the algorithm works for all of these problems. In general, an algorithm partitions a problem into other, smaller problems and solves those. For example, an algorithm for planning a vacation trip requires arrangements for a trip from our home to a nearby airport, a flight to an airport near our vacation spot, and a trip from that airport to our vacation hotel. The entire problem is solved by combining the solutions for these problems.
Designing an algorithm distinguishes two kinds of problems: those that are TRIVIALLY SOLVABLE54 and those that are not. If a given problem is trivially solvable, an algorithm produces the matching solution. For example, the problem of getting from our home to a nearby airport might be trivially solvable. We can drive there, take a cab, or ask a friend to drop us off. If not, the algorithm generates a new problem and solves those new problems. A multistage trip is an example of a problem that is non-trivial and can be solved by generating new, smaller problems. In a computational setting one of the smaller problems often belongs to the same class of problems as the original one, and it is for this reason that we call the approach GENERATIVE RECURSION.
In this part of the book, we study the design of algorithms, that is, functions based on generative recursion. From the description of the idea, we know that this process is much more of an ad hoc activity than the data-driven design of structurally recursive functions. Indeed, it is almost better to call it inventing an algorithm than designing one. Inventing an algorithm requires a new insight -- a ``eureka.'' Sometimes very little insight is required. For example, solving a ``problem'' might just require the enumeration of a series of numbers. At other times, however, it may rely on a mathematical theorem concerning numbers. Or, it may exploit a series of mathematical results on systems of equations. To acquire a good understanding of the design process, it is necessary to study examples and to develop a sense for the various classes of examples. In practice, new complex algorithms are often developed by mathematicians and mathematical computer scientists; programmers, though, must throughly understand the underlying ideas so that they can invent the simple algorithms on their own and communicate with scientists about the others.
The two subsections illustrate two vastly different algorithms. The first one is an example of something programmers invent on a daily basis. The second one describes a fast sorting algorithm, one of the first applications of generative recursion in computing.
Terminology: Mathematical computer scientists often do not distinguish between structural recursion and generative recursion and refer to both kinds of functions as algorithms. Instead they use the terminology of RECURSIVE and ITERATIVE methods. The latter refers to a subclass of function definitions whose recursive function applications are in a particular position in the definition. We will strictly adhere to the terminology of algorithm and generative recursion when we work with this class of functions because this classification matches our thinking of design recipes better than the purely syntactic classification of applied mathematicians.
Let's consider the simple-looking problem of modeling the moves of a ball across a table. Assume the ball rolls at a constant speed until it drops off the table. We can model the table with a canvas of some fixed width and height. The ball is a disk that moves across the canvas, which we express with drawing the disk, waiting, and clearing it, until it is out of bounds.
Figure 66 collects the function, structure, data, and variable definitions that model the ball:
A ball is a structure with four fields: the current position and the
velocity in each direction. That is, the first two numbers in a
ball
structure are the current position on the canvas, and the
next two numbers describe how far the ball moves in the two directions per
step.
The function move-ball
models the physical movement of the
ball. It consumes a ball and creates a new one, modeling one step.
The function draw-and-clear
draws the ball at its current
position, then waits for a short time, and clears it again.
The variable definitions specify the dimensions of the canvas and the delay time.
To move the ball a few times we can write
(define the-ball (make-ball 10 20 -5 +17)) (and (draw-and-clear the-ball) (and (draw-and-clear (move-ball the-ball)) ...))
though this gets tedious after a while. We should instead develop a function that moves the ball until it is out of bounds.
The easy part is to define out-of-bounds?
, a function that
determines whether a given ball is still visible on the canvas:
;;out-of-bounds? : a-ball -> boolean
;; to determine whethera-ball
is outside of the bounds ;; domain knowledge, geometry (define (out-of-bounds? a-ball) (not (and (<= 0 (ball-x a-ball) WIDTH) (<= 0 (ball-y a-ball) HEIGHT))))
We have defined numerous functions like out-of-bounds?
in the
first few sections of the book.
In contrast, writing a function that draws the ball on the canvas until it is out of bounds belongs to a group of programs that we haven't encountered thus far. Let's start with the basics of the function:
;; move-until-out : a-ball -> true
;; to model the movement of a ball until it goes out of bounds
(define (move-until-out a-ball) ...)
Because the function consumes a ball and draws its movement on a canvas, it
produces true
like all other functions that draw onto a canvas.
Designing it with the recipe for structures makes no sense, however. After
all, it is already clear how to draw-and-clear
the ball and how to
move it, too. What is needed instead is a case distinction that checks
whether the ball is out of bounds or not.
Let us refine the function header with an appropriate cond-expression:
(define (move-until-out a-ball) (cond [(out-of-bounds? a-ball) ...] [else ...]))
We have already defined the function out-of-bounds?
because it was
clear from the problem description that ``being out of bounds'' was a
separate concept.
If the ball consumed by move-until-out
is outside of the canvas's
boundaries, the function can produce true
, following the
contract. If the ball is still inside the boundaries, two things must
happen. First, the ball must be drawn and cleared from the canvas. Second,
the ball must be moved, and then we must do things all over again. This
implies that after moving the ball, we apply move-until-out
again,
which means the function is recursive:
;; move-until-out : a-ball -> true
;; to model the movement of a ball until it goes out of bounds
(define (move-until-out a-ball)
(cond
[(out-of-bounds? a-ball) true]
[else (and (draw-and-clear a-ball)
(move-until-out (move-ball a-ball)))]))
Both (draw-and-clear a-ball)
and (move-until-out
(move-ball a-ball))
produce true, and both expressions must be
evaluated. So we combine them with an and-expression.
We can now test the function as follows:
(start WIDTH HEIGHT) (move-until-out (make-ball 10 20 -5 +17)) (stop)
This creates a canvas of proper size and a ball that moves left and down.
A close look at the function definition reveals two peculiarities. First,
although the function is recursive, its body consists of a
cond-expression whose conditions have nothing to do with the input
data. Second, the recursive application in the body does not consume a part
of the input. Instead, move-until-out
generates an entirely new
and different ball
structure, which represents the original ball
after one step, and uses it for the recursion. Clearly, none of our design
recipes could possibly produce such a definition. We have encountered a new
way of programming.
Exercise 25.1.1. What happens if we place the following three expressions
(start WIDTH HEIGHT) (move-until-out (make-ball 10 20 0 0)) (stop)
at the bottom of the Definitions window and click Execute? Does the second expression ever produce a value so that the third expression is evaluated and the canvas disappears? Could this happen with any of the functions designed according to our old recipes? Solution
Exercise 25.1.2.
Develop move-balls
. The function consumes a list of balls and
moves each one until all of them have moved out of bounds.
Hint: It is best to write this function using filter
,
andmap
, and similar abstract functions from
part IV.
Solution
Hoare's quicksort algorithm is the classic example of generative recursion
in computing. Like sort
in section 12.2,
qsort
is a function that consumes a list of numbers and
produces a version that contains the same numbers in ascending order. The
difference between the two functions is that sort
is based on
structural recursion
and qsort
is based on generative
recursion.
The underlying idea of the generative step is a time-honored strategy:
divide and conquer.
That is, we divide the non-trivial instances of the
problem into two smaller, related problems, solve those smaller problems,
and combine their solutions into a solution for the original problem. In
the case of qsort
, the intermediate goal is to divide the list
of numbers into two lists: one that contains all the items that are
strictly smaller than the first item, and another one with all those items
that are strictly larger than the first item. Then the two smaller lists
are sorted using the same procedure. Once the two lists are sorted, we
simply juxtapose the pieces. Owing to its special role, the first item on
the list is often called the pivot item.
|
To develop a better understanding of the process, let's walk through one step of the evaluation by hand. Suppose the input is
(list 11 8 14 7)
The pivot item is 11
. Partioning the list into items larger and
smaller than 11
produces two lists:
(list 8 7)
and
(list 14)
The second one is already sorted in ascending order; sorting the first one
produces (list 7 8)
. This leaves us with three pieces from the
original list:
(list 7 8)
, the sorted version of the list with the smaller numbers;
11
;
and
(list 14)
, the sorted version of the list with the larger
numbers.
To produce a sorted version of the original list, we concatenate the three
pieces, which yields the desired result: (list 7 8 11 14)
.
Our illustration leaves open how qsort
knows when to
stop. Since it is a function based on generative recursion, the general
answer is that it stops when the sorting problem has become
trivial. Clearly, empty
is one trivial input for
qsort
, because the only sorted version of it is
empty
. For now, this answer suffices; we will return to this
question in the next section.
Figure 67 provides a tabular overview of the entire sorting
process for (list 11 8 14 7)
. Each box has three compartments:
| |||||||||
| |||||||||
|
The top compartment shows the list that we wish to sort, and the bottommost contains the result. The three columns in the middle display the sorting process for the two partitions and the pivot item.
Exercise 25.2.1.
Simulate all qsort
steps for (list 11 9 2 18 12 14 4 1)
.
Solution
Now that we have a good understanding of the generative step, we can
translate the process description into Scheme. The description suggests that
qsort
distinguishes two cases. If the input is empty
,
it produces empty
. Otherwise, it performs a generative recursion.
This case-split suggests a cond-expression:
;;quick-sort : (listof number) -> (listof number)
;; to create a list of numbers with the same numbers as ;;alon
sorted in ascending order (define (quick-sort alon) (cond [(empty? alon) empty] [else ...]))
The answer for the first case is given. For the second case, when
qsort
's input is non-empty
, the algorithm uses the
first item to partition the rest of the list into two sublists: a list with
all items smaller than the pivot item and another one with those larger
than the pivot item.
Since the rest of the list is of unknown size, we leave the task of
partitioning the list to two auxiliary functions: smaller-items
and larger-items
. They process the list and filter out those items
that are smaller and larger, respectively, than the first one. Hence each
auxiliary function accepts two arguments, namely, a list of numbers and a
number. Developing these functions is, of course, an exercise in structural
recursion;
their definitions are shown in figure 68.
|
Each sublist is sorted separately, using quick-sort
. This implies
the use of recursion and, more specifically, the following two expressions:
(quick-sort (smaller-items alon (first alon)))
,
which sorts the list of items smaller than the pivot; and
(quick-sort (larger-items alon (first alon)))
,
which sorts the list of items larger than the pivot.
Once we get the sorted versions of the two lists, we need a function that
combines the two lists and the pivot item. Scheme's append
function accomplishes this:
(append (quick-sort (smaller-items alon (first alon))) (list (first alon)) (quick-sort (larger-items alon (first alon))))
Clearly, all items in list 1 are smaller than the pivot and the pivot is
smaller than all items in list 2, so the result is a sorted list.
Figure 68 contains the full function. It includes the
definition of quick-sort
, smaller-items
, and
larger-items
.
Let's take a look at the beginning of a sample hand evaluation:
(quick-sort (list 11 8 14 7)) = (append (quick-sort (list 8 7)) (list 11) (quick-sort (list 14)))
= (append (append (quick-sort (list 7)) (list 8) (quick-sort empty)) (list 11) (quick-sort (list 14)))
= (append (append (append (quick-sort empty) (list 7) (quick-sort empty)) (list 8) (quick-sort empty)) (list 11) (quick-sort (list 14)))
= (append (append (append empty (list 7) empty) (list 8) empty) (list 11) (quick-sort (list 14)))
= (append (append (list 7) (list 8) empty) (list 11) (quick-sort (list 14))) = ...
The calculation shows the essential steps of the sorting process, that is,
the partitioning steps, the recursive sorting steps, and the concatenation
of the three parts. From this calculation, we can see that
quick-sort
implements the process illustrated in
figure 67.
Exercise 25.2.2. Complete the above hand-evaluation.
The hand-evaluation of (quick-sort (list 11 8 14 7))
suggests an
additional trivial case for quick-sort
. Every time
quick-sort
consumes a list of one item, it produces the very same
list. After all, the sorted version of a list of one item is the list
itself.
Modify the definition of quick-sort
to take advantage of this
observation.
Hand-evaluate the same example again. How many steps does the extended algorithm save? Solution
Exercise 25.2.3.
While quick-sort
quickly reduces the size of the problem in many
cases, it is inappropriately slow for small problems. Hence people often
use quick-sort
to reduce the size of the problem and switch to a
different sort function when the list is small enough.
Develop a version of quick-sort
that uses sort
from
section 12.2 if the length of the input is below some
threshold.
Solution
Exercise 25.2.4.
If the input to quick-sort
contains the same number several times,
the algorithm returns a list that is strictly shorter than the input. Why?
Fix the problem so that the output is as long as the input.
Solution
Exercise 25.2.5.
Use the filter
function to define smaller-items
and
larger-items
as one-liners.
Solution
Exercise 25.2.6.
Develop a variant of quick-sort
that uses only one comparison
function, say, <
. Its partitioning step divides the given list
alon
into a list that contains the items of alon
smaller
than (first alon)
and another one with those that are not smaller.
Use local
to combine the functions into a single function.
Then abstract the new version to consume a list and a comparison function:
;; general-quick-sort : (X X -> bool) (list X) -> (list X)
(define (general-quick-sort a-predicate a-list) ...)