At first glance, the algorithms
quick-sort have little in common. One processes structures; the
other processes lists. One creates a new structure for the generative step;
the other splits up a list into three pieces and recurs on two of them. In
short, a comparison of the two examples of generative recursion suggests
that the design of algorithms is an ad hoc activity and that it is
impossible to come up with a general design recipe. A closer look, however,
suggests a different picture.
First, even though we speak of algorithms as processes that solve problems, they are still functions that consume and produce data. In other words, we still choose data to represent a problem, and we must definitely understand the nature of our data if we wish to understand the process. Second, we describe the processes in terms of data, for example, ``creating a new structure'' or ``partitioning a list of numbers.'' Third, we always distinguish between input data for which it is trivial to produce a solution and those for which it is not. Fourth, the generation of problems is the key to the design of algorithms. Although the idea of how to generate a new problem might be independent of a data representation, it must certainly be implemented for whatever form of representation we choose for our problem. Finally, once the generated problems have been solved, the solutions must be combined with other values.
Let us examine the six general stages of our structural design recipe in light of our discussion:
move-until-outthe process is trivial and doesn't need more than a few words. For others, including,
quick-sort, the process relies on a non-trivial idea for its generative step, and its explanation requires a good example such as the one in figure 67.
(define (generative-recursive-fun problem) (cond [(trivially-solvable? problem) (determine-solution problem)] [else (combine-solutions ... problem ... (generative-recursive-fun (generate-problem-1 problem)) (generative-recursive-fun (generate-problem-n problem)))]))
What is a trivially solvable problem?
What is a corresponding solution?
How do we generate new problems that are more easily solvable than the original problem? Is there one new problem that we generate or are there several?
Is the solution of the given problem the same as the solution of (one of) the new problems? Or, do we need to combine the solutions to create a solution for the original problem? And, if so, do we need anything from the original problem data?
To define the algorithm, we must express the answers to these four questions in terms of our chosen data representation.
Exercise 26.0.7. Formulate informal answers to the four key questions for the problem of modeling a ball's movement across a canvas until it is out of bounds. Solution
Formulate informal answers to the four key questions for the
quick-sort problem. How many instances of
generate-problem are there?
Unfortunately, the standard recipe is not good enough for the design of algorithms. Up to now, a function has always produced an output for any legitimate input. That is, the evaluation has always stopped. After all, by the nature of our recipe, each natural recursion consumes an immediate piece of the input, not the input itself. Because data is constructed in a hierarchical manner, this means that the input shrinks at every stage. Hence the function sooner or later consumes an atomic piece of data and stops.
With functions based on generative recursion, this is no longer true. The internal recursions don't consume an immediate component of the input but some new piece of data, which is generated from the input. As exercise 25.1.1 shows, this step may produce the input over and over again and thus prevent the evaluation from ever producing a result. We say that the program LOOPS or is in an INFINITE LOOP.
In addition, even the slightest mistake in translating the process
description into a function definition may cause an infinite loop. The
problem is most easily understood with an example. Consider the following
smaller-items, one of the two ``problem
smaller-items : (listof number) number -> (listof number);; to create a list with all those numbers on
alon;; that are smaller than or equal to
threshold(define (smaller-items alon threshold) (cond [(empty? alon) empty] [else (if (<= (first alon) threshold) (cons (first alon) (smaller-items (rest alon) threshold)) (smaller-items (rest alon) threshold))]))
< it employs
<= to compare numbers. As a
result, this function produces
(list 5) when applied to
(list 5) and
Worse, if the
quick-sort function from figure 68 is
combined with this new version of
smaller-items, it doesn't produce
any output for
(quick-sort (list 5)) = (append (quick-sort (smaller-items 5 (list 5))) (list 5) (quick-sort (larger-items 5 (list 5)))) = (append
(quick-sort (list 5))(list 5) (quick-sort (larger-items 5 (list 5))))
The first recursive use demands that
quick-sort solve the problem
(list 5) -- but that is the exact problem that we
started with. Since this is a circular evaluation,
(list 5)) never produces a result. More generally, there is no guarantee
that the size of the input for a recursive call brings us closer to a
solution than the original input.
The lesson from this example is that the design of algorithms requires one
more step in our design recipe: a TERMINATION ARGUMENT,
explains why the process produces an output for every input and how the
function implements this idea; or a warning, which
explains when the process may not terminate.
quick-sort, the argument might look like this:
At each step,Without such an argument an algorithm must be considered incomplete.
quick-sortpartitions the list into two sublists using
larger-items. Each function produces a list that is smaller than the input (the second argument), even if the threshold (the first argument) is an item on the list. Hence each recursive application of
quick-sortconsumes a strictly shorter list than the given one. Eventually,
quick-sortreceives and returns
A good termination argument may on occasion also reveal additional
termination cases. For example,
(smaller-items N (list N)) and
(larger-items N (list N)) always produce
empty for any
N. Therefore we know that
quick-sort's answer for
(list N) is
(list N).55 To add this
quick-sort, we simply add a
(define (quick-sort alon) (cond [(empty? alon) empty] [(empty? (rest alon)) alon] [else (append (quick-sort (smaller-items alon (first alon))) (list (first alon)) (quick-sort (larger-items alon (first alon))))]))
(empty? (rest alon)) is one way to ask
alon contains one item.
Figure 69 summarizes the suggestions on the design of algorithms. The dots indicate that the design of an algorithm requires a new step: the termination argument. Read the table in conjunction with those of the preceding chapters.
Define the function
tabulate-div, which accepts a number
and tabulates the list of all of its divisors, starting with
1 and ending in
n. A number
d is a divisior of a
n if the remainder of dividing
0, that is,
(= (remainder n d) 0) is true. The smallest
divisior of any number is
1; the largest one is the number
Develop the function
merge-sort, which sorts a list of numbers in
ascending order, using the following two auxiliary functions:
The first one,
make-singles, constructs a list of
one-item lists from the given list of numbers. For example,
(equal? (make-singles (list 2 5 9 3)) (list (list 2) (list 5) (list 9) (list 3)))
The second one,
merge-all-neighbors, merges pairs of neighboring
lists. More specifically, it consumes a list of lists (of numbers) and
merges neighbors. For example,
(equal? (merge-all-neighbors (list (list 2) (list 5) (list 9) (list 3))) (list (list 2 5) (list 3 9))) (equal? (merge-all-neighbors (list (list 2 5) (list 3 9))) (list (list 2 3 5 9)))
In general, this function yields a list that is approximately half as long as the input. Why is the output not always half as long as the input?
Make sure to develop the functions independently.
merge-sort first uses
make-singles to create
a list of single lists; then it relies on
shorten the list of lists until it contains a single list. The latter is
The template for algorithms is so general that it even covers functions based on structural recursion. Consider the version with one termination clause and one generation step:
(define (generative-recursive-fun problem) (cond [(trivially-solvable? problem) (determine-solution problem)] [else (combine-solutions problem (generative-recursive-fun (generate-problem problem)))]))
If we replace
rest, the outline
template for a list-processing function:
(define (generative-recursive-fun problem) (cond [(empty? problem) (determine-solution problem)] [else (combine-solutions problem (generative-recursive-fun (rest problem)))]))
combine-solutions so that
generative-recursive-fun computes the length of its
This discussion raises the question of whether there is a difference between between functions based on structural recursion and those based on generative recursion. The answer is ``it depends.'' Of course, we could say that all functions using structural recursion are just special cases of generative recursion. This ``everything is equal'' attitude, however, is of no help if we wish to understand the process of designing functions. It confuses two classes of functions that are designed with different approaches and that have different consequences. One relies on a systematic data analysis and not much more; the other requires a deep, often mathematical, insight into the problem-solving process itself. One leads programmers to naturally terminating functions; the other requires a termination argument.
A simple inspection of a function's definition quickly shows whether a
function uses structural or generative recursion. All self-applications of
a structurally recursive function always receive an immediate component of
the current input for further processing. For example, for a
constructed list, the immediate components are the
item and the
rest of the list. Hence, if a function consumes a
plain list and its recursive use does not consume the rest of the list,
its definition is not structural but generative. Or, put positively,
properly recursive algorithms consume newly generated input, which may or
may not contain components of the input. In any case, the new piece of data
represents a different problem than the given one, but still a problem of
the same general class of problems.
A user cannot distinguish
consume a list of numbers; both produce a list that consists of the same
numbers arranged in ascending order. To an observer, the functions are
completely equivalent.56 This raises the question of which of the two a
programmer should provide. More generally, if we can develop a function
using structural recursion and an equivalent one using generative
recursion, what should we do?
To understand this choice better, let's discuss another classical example of generative recursion from mathematics: the problem of finding the greatest common divisor of two positive natural numbers.57 All such numbers have at least one divisor in common: 1. On occasion, this is also the only common divisor. For example, 2 and 3 have only 1 as common divisor because 2 and 3, respectively, are the only other divisors. Then again, 6 and 25 are both numbers with several divisors:
6 is evenly divisible by 1, 2, 3, and 6;
25 is evenly divisible by 1, 5, and 25.
Still, the greatest common divisior of 25 and 6 is 1. In contrast, 18 and 24 have many common divisors:
18 is evenly divisible by 1, 2, 3, 6, 9, and 18;
24 is evenly divisible by 1, 2, 3, 4, 6, 8, 12, and 24.
The greatest common divisor is 6.
Following the design recipe, we start with a contract, a purpose statement, and a header:
gcd : N[>= 1] N[>= 1] -> N;; to find the greatest common divisior of
m(define (gcd n m) ...)
The contract specifies the precise inputs: natural
numbers that are greater or equal to
Now we need to make a decision whether we want to pursue a design based on
structural or on generative recursion. Since the answer is by no means
obvious, we develop both. For the structural version, we must consider
which input the function should process:
m, or both.
A moment's consideration suggests that what we really need is a function
that starts with the smaller of the two and outputs the first
number smaller or equal to this input that evenly divides both
local to define an appropriate auxiliary function: see
figure 70. The conditions ``evenly divisible'' have
been encoded as
(= (remainder n i) 0) and
(= (remainder m
i) 0). The two ensure that
without a remainder. Testing
gcd-structural with the examples
shows that it finds the expected answers.
Although the design of
gcd-structural is rather straightforward,
it is also naive. It simply tests for every number whether it divides
m evenly and returns the first such number.
For small natural numbers, this process works just fine. Consider the
following example, however:
(gcd-structural 101135853 45014640)
The result is
177 and to get there
gcd-structural had to
compare 101135676, that is, 101135853 - 177, numbers. This is a large effort
and even reasonably fast computers spend several minutes on this task.
Enter the definition of
gcd-structural into the Definitions
window and evaluate
(time (gcd-structural 101135853 45014640)) in
Hint: After testing
gcd-structural conduct the performance tests
in the Full Scheme language (without debugging), which evaluates expressions
faster than the lower language levels but with less protection. Add
to the top of the Definitions
window. Have some reading handy!
Since mathematicians recognized the inefficiency of the ``structural
algorithm'' a long time ago, they studied the problem of finding divisiors
in more depth. The essential insight is that for two natural numbers
smaller, their greatest common divisor is
equal to the greatest common divisior of
smaller and the
larger divided into
is how we can put this insight into equational form:
(gcd larger smaller) = (gcd smaller (remainder larger smaller))
(remainder larger smaller) is smaller than both
smaller, the right-hand side use of
Here is how this insight applies to our small example:
The given numbers are
According to the mathematicians' insight, they have the same greatest
common divisor as
And these two have the same greatest common divisor as
And here we seem stuck because
0 is nothing expected. But,
0 can be evenly divided by every number, so we have
found our answer:
Working through the example not only explains the idea but also suggests
how to discover the case with a trivial solution. When the smaller of the
two numbers is
0, the result is the larger number. Putting
everything together, we get the following definition:
gcd-generative : N[>= 1] N[>=1] -> N;; to find the greatest common divisior of
m;; generative recursion:
(gcd n m)=
(gcd n (remainder m n))if
(<= m n)(define (gcd-generative n m) (local ((define (clever-gcd larger smaller) (cond [(= smaller 0) larger] [else (clever-gcd smaller (remainder larger smaller))]))) (clever-gcd (max m n) (min m n))))
local definition introduces the workhorse of the function:
clever-gcd, a function based on generative recursion. Its first
line discovers the trivially solvable case by comparing
0 and produces the matching solution. The generative step uses
smaller as the new first argument and
smaller) as the new second argument to
clever-gcd, exploiting the
If we now use
gcd-generative with our complex example from above:
(gcd-generative 101135853 45014640)
we see that the response is nearly instantaneous. A hand-evaluation shows
clever-gcd recurs only nine times before it produces the
177. In short, generative recursion has helped find us a
much faster solution to our problem.
Formulate informal answers to the four key questions for
gcd-generative and evaluate
(time (gcd-generative 101135853 45014640))
in the Interactions window.
(clever-gcd 101135853 45014640) by hand. Show only those lines
that introduce a new recursive call to
Formulate a termination argument for
Considering the above example, it is tempting to develop functions using
generative recursion. After all, they produce answers faster! This
judgment is too rash for three reasons. First, even a well-designed
algorithm isn't always faster than an equivalent structurally recursive
function. For example,
quick-sort wins only for large lists; for
small ones, the standard
sort function is faster. Worse, a badly
designed algorithm can wreak havoc on the performance of a program. Second,
it is typically easier to design a function using the recipe for structural
recursion. Conversely, designing an algorithm requires an idea of how to
generate new, smaller problems, a step that often requires deep
mathematical insight. Finally, people who read functions can easily
understand structurally recursive functions, even without much
documentation. To understand an algorithm, the generative step must be
well explained, and even with a good explanation, it may still be difficult
to grasp the idea.
Experience shows that most functions in a program employ structural recursion; only a few exploit generative recursion. When we encounter a situation where a design could use either the recipe for structural or generative recursion, the best approach is often to start with a structural version. If it turns out to be too slow, the alternative design using generative recursion should be explored. If it is chosen, it is important to document the problem generation with good examples and to give a good termination argument.
Exercise 26.3.5. Evaluate
(quick-sort (list 10 6 8 9 14 12 3 11 14 16 2))
by hand. Show only those lines that introduce a new recursive call to
quick-sort. How many recursive applications of
are required? How many recursive applications of
append? Suggest a
general rule for a list of length
(quick-sort (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14))
by hand. How many recursive applications of
required? How many recursive applications of
append? Does this
contradict the first part of the exercise?
quick-sort to the
Definitionswindow. Test the functions and then explore how fast each works on various
lists. The experiment should confirm the claim that the plain
function wins (in many comparisons) over
quick-sort for short
lists and vice versa. Determine the cross-over point. Then build a
sort-quick-sort function that behaves like
large lists and switches over to the plain
sort function for lists
below the cross-over point.
Hints: (1) Use the ideas of exercise 26.3.5 to create
test cases. (2) Develop
create-tests, a function that creates
large test cases randomly. Then evaluate
(define test-case (create-tests 10000)) (collect-garbage) (time (sort test-case)) (collect-garbage) (time (quick-sort test-case))
The uses of
collect-garbage helps DrScheme deal with large
56 The concept of observably equivalent functions and expressions plays a central role in the study of programming languages and their meaning.
57 The material on the greatest common divisor was suggested by John Stone.