At first glance, the algorithms move-until-out and quick-sort have little in common. One processes structures; the other processes lists. One creates a new structure for the generative step; the other splits up a list into three pieces and recurs on two of them. In short, a comparison of the two examples of generative recursion suggests that the design of algorithms is an ad hoc activity and that it is impossible to come up with a general design recipe. A closer look, however, suggests a different picture.
First, even though we speak of algorithms as processes that solve problems, they are still functions that consume and produce data. In other words, we still choose data to represent a problem, and we must definitely understand the nature of our data if we wish to understand the process. Second, we describe the processes in terms of data, for example, ``creating a new structure'' or ``partitioning a list of numbers.'' Third, we always distinguish between input data for which it is trivial to produce a solution and those for which it is not. Fourth, the generation of problems is the key to the design of algorithms. Although the idea of how to generate a new problem might be independent of a data representation, it must certainly be implemented for whatever form of representation we choose for our problem. Finally, once the generated problems have been solved, the solutions must be combined with other values.
Let us examine the six general stages of our structural design recipe in light of our discussion. Consider, for example, how the process itself is described: for some algorithms, such as move-until-out, the process is trivial and doesn't need more than a few words. For others, including quick-sort, the process relies on a non-trivial idea for its generative step, and its explanation requires a good example such as the one in figure 67. In general, the template for an algorithm has roughly the following shape:
(define (generative-recursive-fun problem)
  (cond
    [(trivially-solvable? problem) (determine-solution problem)]
    [else (combine-solutions
            ... problem ...
            (generative-recursive-fun (generate-problem-1 problem))
            (generative-recursive-fun (generate-problem-n problem)))]))
This template corresponds to four key questions:

1. What is a trivially solvable problem?
2. What is a corresponding solution?
3. How do we generate new problems that are more easily solvable than the original problem? Is there one new problem that we generate or are there several?
4. Is the solution of the given problem the same as the solution of (one of) the new problems? Or do we need to combine the solutions to create a solution for the original problem? And, if so, do we need anything from the original problem data?
To define the algorithm, we must express the answers to these four questions in terms of our chosen data representation.
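To relate the questions to the template, here is the correspondence between the questions and the template's placeholder names, written as comments (a summary, not part of the original template):

;; trivially-solvable?  -- question 1: which problems have an immediate answer?
;; determine-solution   -- question 2: what is that answer?
;; generate-problem-1 ... generate-problem-n
;;                      -- question 3: how do we generate new, more easily
;;                         solvable problems, and how many are there?
;; combine-solutions    -- question 4: how do we combine the solutions of the
;;                         generated problems (and, if needed, parts of the
;;                         original problem data) into the final solution?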
Exercise 26.0.7. Formulate informal answers to the four key questions for the problem of modeling a ball's movement across a canvas until it is out of bounds.
Exercise 26.0.8. Formulate informal answers to the four key questions for the quick-sort problem. How many instances of generate-problem are there?
Unfortunately, the standard recipe is not good enough for the design of algorithms. Up to now, a function has always produced an output for any legitimate input. That is, the evaluation has always stopped. After all, by the nature of our recipe, each natural recursion consumes an immediate piece of the input, not the input itself. Because data is constructed in a hierarchical manner, this means that the input shrinks at every stage. Hence the function sooner or later consumes an atomic piece of data and stops.
With functions based on generative recursion, this is no longer true. The internal recursions don't consume an immediate component of the input but some new piece of data, which is generated from the input. As exercise 25.1.1 shows, this step may produce the input over and over again and thus prevent the evaluation from ever producing a result. We say that the program LOOPS or is in an INFINITE LOOP.
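To make this concrete, here is a minimal, purely illustrative function in the shape of the template (a hypothetical example, not from the text): its "generated" problem is the input itself, so for any non-empty list the evaluation never reaches the trivially solvable case.

;; loop-forever : (listof number) -> (listof number)
;; an illustration only: the generated problem is the given problem,
;; so (loop-forever (list 1)) never produces a result
(define (loop-forever alon)
  (cond
    [(empty? alon) empty]
    [else (loop-forever alon)]))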
In addition, even the slightest mistake in translating the process description into a function definition may cause an infinite loop. The problem is most easily understood with an example. Consider the following definition of smaller-items, one of the two ``problem generators'' for quick-sort:
;; smaller-items : (listof number) number -> (listof number)
;; to create a list with all those numbers on alon
;; that are smaller than or equal to threshold
(define (smaller-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (<= (first alon) threshold)
              (cons (first alon) (smaller-items (rest alon) threshold))
              (smaller-items (rest alon) threshold))]))
Instead of < it employs <= to compare numbers. As a result, this function produces (list 5) when applied to (list 5) and 5.
Worse, if the quick-sort function from figure 68 is combined with this new version of smaller-items, it doesn't produce any output for (list 5):
  (quick-sort (list 5))
= (append (quick-sort (smaller-items (list 5) 5))
          (list 5)
          (quick-sort (larger-items (list 5) 5)))
= (append (quick-sort (list 5))
          (list 5)
          (quick-sort (larger-items (list 5) 5)))
The first recursive use demands that quick-sort solve the problem of sorting (list 5) -- but that is exactly the problem we started with. Since the evaluation is circular, (quick-sort (list 5)) never produces a result. More generally, there is no guarantee that the input of a recursive call is any closer to a solution than the original input.
The lesson from this example is that the design of algorithms requires one more step in our design recipe: a TERMINATION ARGUMENT, which explains why the process produces an output for every input and how the function implements this idea, or a warning, which explains when the process may not terminate. Without such an argument an algorithm must be considered incomplete. For quick-sort, the argument might look like this:

At each step, quick-sort partitions the list into two sublists using smaller-items and larger-items. Each function produces a list that is shorter than the given list (its first argument), even if the threshold (its second argument) is an item on the list. Hence each recursive application of quick-sort consumes a strictly shorter list than the given one. Eventually, quick-sort receives and returns empty.
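The argument relies on the helpers using a strict comparison, so that the pivot itself never reappears in a sublist. Figure 68 is not reproduced here; the following is a sketch of the intended smaller-items, reconstructed from the remark above that it uses < rather than <= (larger-items is analogous, with > in place of <):

;; smaller-items : (listof number) number -> (listof number)
;; to create a list with all those numbers on alon
;; that are strictly smaller than threshold
(define (smaller-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (< (first alon) threshold)
              (cons (first alon) (smaller-items (rest alon) threshold))
              (smaller-items (rest alon) threshold))]))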
A good termination argument may on occasion also reveal additional termination cases. For example, (smaller-items (list N) N) and (larger-items (list N) N) always produce empty for any N. Therefore we know that quick-sort's answer for (list N) is (list N).55 To add this knowledge to quick-sort, we simply add a cond-clause:
(define (quick-sort alon)
  (cond
    [(empty? alon) empty]
    [(empty? (rest alon)) alon]
    [else (append (quick-sort (smaller-items alon (first alon)))
                  (list (first alon))
                  (quick-sort (larger-items alon (first alon))))]))
The condition (empty? (rest alon)) is one way to ask whether alon contains one item.
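With the extra clause, the problematic example from above now terminates immediately. Here is a sketch of the hand evaluation:

  (quick-sort (list 5))
= (cond
    [(empty? (list 5)) empty]
    [(empty? (rest (list 5))) (list 5)]
    [else ...])
= (list 5)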
Figure 69 summarizes the suggestions on the design of algorithms. The dots indicate that the design of an algorithm requires a new step: the termination argument. Read the table in conjunction with those of the preceding chapters.
Exercise 26.1.1. Define the function tabulate-div, which accepts a number n and tabulates the list of all of its divisors, starting with 1 and ending in n. A number d is a divisor of a number n if the remainder of dividing n by d is 0, that is, (= (remainder n d) 0) is true. The smallest divisor of any number is 1; the largest one is the number itself.
Exercise 26.1.2. Develop the function merge-sort, which sorts a list of numbers in ascending order, using the following two auxiliary functions:

The first one, make-singles, constructs a list of one-item lists from the given list of numbers. For example,

(equal? (make-singles (list 2 5 9 3))
        (list (list 2) (list 5) (list 9) (list 3)))

The second one, merge-all-neighbors, merges pairs of neighboring lists. More specifically, it consumes a list of lists (of numbers) and merges neighbors. For example,

(equal? (merge-all-neighbors (list (list 2) (list 5) (list 9) (list 3)))
        (list (list 2 5) (list 3 9)))
(equal? (merge-all-neighbors (list (list 2 5) (list 3 9)))
        (list (list 2 3 5 9)))

In general, this function yields a list that is approximately half as long as the input. Why is the output not always half as long as the input?

Make sure to develop the functions independently.

The function merge-sort first uses make-singles to create a list of one-item lists; then it relies on merge-all-neighbors to shorten the list of lists until it contains a single list. The latter is the result.
The template for algorithms is so general that it even covers functions based on structural recursion. Consider the version with one termination clause and one generation step:
(define (generative-recursive-fun problem)
  (cond
    [(trivially-solvable? problem) (determine-solution problem)]
    [else (combine-solutions
            problem
            (generative-recursive-fun (generate-problem problem)))]))
If we replace trivially-solvable? with empty? and generate-problem with rest, the outline is a template for a list-processing function:
(define (generative-recursive-fun problem)
  (cond
    [(empty? problem) (determine-solution problem)]
    [else (combine-solutions
            problem
            (generative-recursive-fun (rest problem)))]))
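For illustration, here is one hypothetical way to instantiate this outline so that it sums a list of numbers; the exercise below asks for a different instantiation, which computes the length.

;; determine-solution : empty -> number
;; the trivially solvable case: the sum of the empty list is 0
(define (determine-solution problem) 0)

;; combine-solutions : (listof number) number -> number
;; to combine the first item of the problem with the solution for its rest
(define (combine-solutions problem solution-for-rest)
  (+ (first problem) solution-for-rest))

With these two definitions in place, (generative-recursive-fun (list 1 2 3)) evaluates to 6.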
Exercise 26.2.1. Define determine-solution and combine-solutions so that the function generative-recursive-fun computes the length of its input.
This discussion raises the question of whether there is a difference between functions based on structural recursion and those based on generative recursion. The answer is ``it depends.'' Of course, we could say that all functions using structural recursion are just special cases of generative recursion. This ``everything is equal'' attitude, however, is of no help if we wish to understand the process of designing functions. It confuses two classes of functions that are designed with different approaches and that have different consequences. One relies on a systematic data analysis and not much more; the other requires a deep, often mathematical, insight into the problem-solving process itself. One leads programmers to naturally terminating functions; the other requires a termination argument.
A simple inspection of a function's definition quickly shows whether a function uses structural or generative recursion. The self-applications of a structurally recursive function always receive an immediate component of the current input for further processing. For example, for a constructed list, the immediate components are the first item and the rest of the list. Hence, if a function consumes a plain list and its recursive use does not consume the rest of the list, its definition is not structural but generative. Or, put positively, properly recursive algorithms consume newly generated input, which may or may not contain components of the input. In any case, the new piece of data represents a different problem than the given one, but still a problem of the same general class of problems.
A user cannot distinguish sort and quick-sort. Both consume a list of numbers; both produce a list that consists of the same numbers arranged in ascending order. To an observer, the functions are completely equivalent.56 This raises the question of which of the two a programmer should provide. More generally, if we can develop a function using structural recursion and an equivalent one using generative recursion, what should we do?
To understand this choice better, let's discuss another classical example of generative recursion from mathematics: the problem of finding the greatest common divisor of two positive natural numbers.57 All such numbers have at least one divisor in common: 1. On occasion, this is also the only common divisor. For example, 2 and 3 have only 1 as common divisor because 2 and 3, respectively, are the only other divisors. Then again, 6 and 25 are both numbers with several divisors:
6 is evenly divisible by 1, 2, 3, and 6;
25 is evenly divisible by 1, 5, and 25.
Still, the greatest common divisor of 6 and 25 is 1. In contrast, 18 and 24 have many common divisors:
18 is evenly divisible by 1, 2, 3, 6, 9, and 18;
24 is evenly divisible by 1, 2, 3, 4, 6, 8, 12, and 24.
The greatest common divisor is 6.
Following the design recipe, we start with a contract, a purpose statement, and a header:
;; gcd : N[>= 1] N[>= 1] -> N
;; to find the greatest common divisor of n and m
(define (gcd n m) ...)
The contract specifies the precise inputs: natural numbers that are greater than or equal to 1 (not 0).
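Following the examples step of the recipe, we can also record the expected results from the discussion above. Since the body of gcd is still to be developed, we write them as comments for now:

;; examples:
;; (gcd 2 3)   should be 1
;; (gcd 6 25)  should be 1
;; (gcd 18 24) should be 6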
Now we must decide whether to pursue a design based on structural or on generative recursion. Since the answer is by no means obvious, we develop both. For the structural version, we must consider which input the function should process: n, m, or both. A moment's consideration suggests that what we really need is a function that starts with the smaller of the two and outputs the first number smaller than or equal to this input that evenly divides both n and m.
We use local to define an appropriate auxiliary function: see figure 70. The conditions ``evenly divisible'' have been encoded as (= (remainder n i) 0) and (= (remainder m i) 0); together they ensure that i divides both n and m without a remainder. Testing gcd-structural with the examples shows that it finds the expected answers.
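Figure 70 is not reproduced here; the following is a sketch of what such a local-based definition might look like, reconstructed from the description above rather than quoted from the figure:

;; gcd-structural : N[>= 1] N[>= 1] -> N
;; to find the greatest common divisor of n and m
;; structural recursion on the candidate divisor, counting down from (min m n)
(define (gcd-structural n m)
  (local ((define (first-divisor-<= i)
            (cond
              [(= i 1) 1]
              [else (if (and (= (remainder n i) 0)
                             (= (remainder m i) 0))
                        i
                        (first-divisor-<= (- i 1)))])))
    (first-divisor-<= (min m n))))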
Although the design of gcd-structural is rather straightforward, it is also naive. It simply tests for every number whether it divides both n and m evenly and returns the first such number. For small natural numbers, this process works just fine. Consider the following example, however:

(gcd-structural 101135853 45014640)

The result is 177, and to get there gcd-structural had to test every candidate from 45014640, the smaller of the two inputs, down to 177, that is, more than 45 million numbers. This is a large effort, and even reasonably fast computers spend several minutes on this task.
Exercise 26.3.1. Enter the definition of gcd-structural into the Definitions window and evaluate (time (gcd-structural 101135853 45014640)) in the Interactions window.

Hint: After testing gcd-structural, conduct the performance tests in the Full Scheme language (without debugging), which evaluates expressions faster than the lower language levels but with less protection. Add (require-library "core.ss") to the top of the Definitions window. Have some reading handy!
Since mathematicians recognized the inefficiency of the ``structural algorithm'' a long time ago, they studied the problem of finding divisors in more depth. The essential insight is that for two natural numbers larger and smaller, their greatest common divisor is equal to the greatest common divisor of smaller and the remainder of dividing larger by smaller. Here is how we can put this insight into equational form:

(gcd larger smaller) = (gcd smaller (remainder larger smaller))

Since (remainder larger smaller) is smaller than both larger and smaller, the right-hand side use of gcd consumes smaller first.
Here is how this insight applies to our small example:

The given numbers are 18 and 24.
According to the mathematicians' insight, they have the same greatest common divisor as 18 and 6.
And these two have the same greatest common divisor as 6 and 0.
Here we seem stuck, because 0 is not one of the expected inputs. But 0 can be evenly divided by every number, so we have found our answer: 6.
Working through the example not only explains the idea but also suggests how to discover the case with a trivial solution. When the smaller of the two numbers is 0, the result is the larger number. Putting everything together, we get the following definition:
;; gcd-generative : N[>= 1] N[>= 1] -> N
;; to find the greatest common divisor of n and m
;; generative recursion: (gcd n m) = (gcd n (remainder m n)) if (<= n m)
(define (gcd-generative n m)
  (local ((define (clever-gcd larger smaller)
            (cond
              [(= smaller 0) larger]
              [else (clever-gcd smaller (remainder larger smaller))])))
    (clever-gcd (max m n) (min m n))))
The local definition introduces the workhorse of the function: clever-gcd, a function based on generative recursion. Its first line discovers the trivially solvable case by comparing smaller to 0 and produces the matching solution. The generative step uses smaller as the new first argument and (remainder larger smaller) as the new second argument to clever-gcd, exploiting the above equation.
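A short hand evaluation on the small example from above shows the generative steps at work (a sketch):

  (clever-gcd 24 18)
= (clever-gcd 18 (remainder 24 18))
= (clever-gcd 18 6)
= (clever-gcd 6 (remainder 18 6))
= (clever-gcd 6 0)
= 6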
If we now use gcd-generative with our complex example from above:

(gcd-generative 101135853 45014640)

we see that the response is nearly instantaneous. A hand evaluation shows that clever-gcd recurs only nine times before it produces the solution: 177. In short, generative recursion has helped us find a much faster solution to our problem.
Exercise 26.3.2. Formulate informal answers to the four key questions for gcd-generative.
Exercise 26.3.3. Define gcd-generative and evaluate

(time (gcd-generative 101135853 45014640))

Evaluate (clever-gcd 101135853 45014640) by hand. Show only those lines that introduce a new recursive call to clever-gcd.
Exercise 26.3.4. Formulate a termination argument for gcd-generative.
Considering the above example, it is tempting to develop functions using generative recursion. After all, they produce answers faster! This judgment is too rash for three reasons. First, even a well-designed algorithm isn't always faster than an equivalent structurally recursive function. For example, quick-sort wins only for large lists; for small ones, the standard sort function is faster. Worse, a badly designed algorithm can wreak havoc on the performance of a program. Second, it is typically easier to design a function using the recipe for structural recursion. In contrast, designing an algorithm requires an idea of how to generate new, smaller problems, a step that often requires deep mathematical insight. Finally, people who read functions can easily understand structurally recursive functions, even without much documentation. To understand an algorithm, the generative step must be well explained, and even with a good explanation, it may still be difficult to grasp the idea.
Experience shows that most functions in a program employ structural recursion; only a few exploit generative recursion. When we encounter a situation where a design could use either the recipe for structural or generative recursion, the best approach is often to start with a structural version. If it turns out to be too slow, the alternative design using generative recursion should be explored. If it is chosen, it is important to document the problem generation with good examples and to give a good termination argument.
Exercise 26.3.5. Evaluate

(quick-sort (list 10 6 8 9 14 12 3 11 14 16 2))

by hand. Show only those lines that introduce a new recursive call to quick-sort. How many recursive applications of quick-sort are required? How many recursive applications of append? Suggest a general rule for a list of length N.

Evaluate

(quick-sort (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14))

by hand. How many recursive applications of quick-sort are required? How many recursive applications of append? Does this contradict the first part of the exercise?
Exercise 26.3.6. Add sort and quick-sort to the Definitions window. Test the functions and then explore how fast each works on various lists. The experiment should confirm the claim that the plain sort function wins (in many comparisons) over quick-sort for short lists and vice versa. Determine the cross-over point. Then build a sort-quick-sort function that behaves like quick-sort for large lists and switches over to the plain sort function for lists below the cross-over point.

Hints: (1) Use the ideas of exercise 26.3.5 to create test cases. (2) Develop create-tests, a function that creates large test cases randomly. Then evaluate

(define test-case (create-tests 10000))
(collect-garbage)
(time (sort test-case))
(collect-garbage)
(time (quick-sort test-case))

The uses of collect-garbage help DrScheme deal with large lists.
55 Of course, we could have just argued that the sorted version of a one-item list is the list, which is the basis of exercise 25.2.2.
56 The concept of observably equivalent functions and expressions plays a central role in the study of programming languages and their meaning.
57 The material on the greatest common divisor was suggested by John Stone.