PROBABILITY AND STATISTICS:
EXPERIMENTAL RESULTS OF A RADICALLY
DIFFERENT TEACHING METHOD
By
Julian L. Simon, David T. Atkinson and Carolyn Shevokas
Reprinted from the AMERICAN MATHEMATICAL MONTHLY
Vol. 83, No. 9, November 1976
pp. 733-739
Introduction. With the Monte Carlo method, students from high
school to graduate school can quickly acquire the ability to handle
probabilistic problems of daily living or scientific research. And
the students understand what they are doing, with little danger of
the formula-grabbing which too often afflicts conventional analytic
methods.
The Illinois procedure for teaching the Monte Carlo method has
been used since 1965 with success: (a) for teaching research
methods to graduate students in several fields at the University of
Illinois who have already had one or several conventional
statistics courses, but who nevertheless find themselves
insufficiently equipped to handle the statistical problems in
research projects; (b) with undergraduates in research methods
courses; (c) with undergraduates as part of conventional statistics
courses; and (d) with high school students down to age 13 or 14, in
the U.S. (Simon and Holmes, 1969), in Israel and in Puerto Rico.
The results seem successful to the teachers and to the
students, as evidenced by the teachers' judgments and the students'
answers to informal questionnaires. But such "soft" evidence is
insufficient to convince skeptics, which is perhaps as it ought to
be. Harder evidence is therefore needed. To supply that evidence
is the task of this paper.
We first recapitulate the method and its logic. Then we
describe three experiments that test the value of the method in a
variety of class settings.
The Monte Carlo method is not offered as a successor to
analytic methods. Rather, it can be an underpinning for analytic
teaching to help students understand analytic methods better. But
it is also a workable and easily-taught alternative for students
who will never study conventional analytic methods to the point of
practical mastery, and this includes most students at all
educational levels. It may be especially useful for the
introduction to statistics of mathematically-disadvantaged
students. (But please do not infer from this that the method is
intellectually inferior; the method is logically acceptable and
intuitively instructive for all students.)
It must be emphasized that the Monte Carlo method as described
here really is intended as an alternative to conventional analytic
methods in actual problem-solving practice. This method is not a
pedagogical device for improving the teaching of conventional
methods. This is quite different from the past use of the Monte
Carlo method to help teach sampling theory, the binomial theorem
and the central limit theorem. The point that is usually hardest
to convey to teachers of statistics is that the method suggested
here really is a complete break with conventional thinking, rather
than a supplement to it or an aid in teaching it. That is, the
simple Monte Carlo method described here is complete in itself for
handling most, perhaps all, problems in probability and
statistics.
The Monte Carlo method always provides a logically acceptable
solution. But more specifically with respect to statistical
hypothesis testing, the Monte Carlo tests based on a randomization
logic have properties that statisticians are now finding attractive
because they are more robust than traditional parametric tests.
(For the test of differences in means between two groups based on
Fisher's randomization test, see Dwass 1957, and Chung and Fraser,
1958; for a variety of other tests see Simon, 1969, Chapters 23 and
24.) Hence the Monte Carlo test is often a better scientific
choice than the conventional test, in addition to its pedagogical
advantages.
To illustrate the method, here is a sample question and
examination answer by a high school student (one who qualified for
an experimental course) after just six hours of classroom
instruction:
John tells you that with his old method of shooting foul shots
in basketball his average (over a long period of time) was .6. Then
he tried a new method of shooting and scored successes with nine of
his first ten shots. Should he conclude that the new method is
really better than the old method?
Student A.S.:
(a) Take twelve hearts to represent hits in shooting and
eight spades to be misses; this is John's old probability in
scoring.
(b) Shuffle, draw a card and record "hit" or "miss", replace
it and shuffle.
(c) Repeat ten times altogether for one trial.
(d) See how many times nine hits or more come up on ten
tries.
1. hit, hit, miss, miss, h, h, h, h, h, m: 7/10 hits    9. 6/10
2. 7/10                                                10. 9/10
3. 8/10                                                11. 5/10
4. 4/10                                                12. 7/10
5. 6/10                                                13. 8/10
6. 7/10                                                14. 6/10
7. 7/10                                                15. 8/10
Only 1 time in 15 times will 9/10 shots be made by the old .6
chance, so it seems probable here that John's new method helped.
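The student's card procedure maps directly onto a short computer simulation. The sketch below is ours, not part of the original course materials, and its function name and trial count are merely illustrative; it repeats the ten-shot trial many times at the old .6 hit rate and counts how often nine or more hits appear.

```python
import random

def simulate_foul_shots(hit_rate=0.6, shots=10, threshold=9,
                        trials=10_000, rng=None):
    """Estimate the chance of `threshold` or more hits in `shots`
    attempts, assuming the old hit rate still holds."""
    rng = rng or random.Random()
    successes = 0
    for _ in range(trials):
        # One trial: take `shots` shots, each a hit with probability hit_rate.
        hits = sum(rng.random() < hit_rate for _ in range(shots))
        if hits >= threshold:
            successes += 1
    # The proportion of trials reaching the threshold estimates the probability.
    return successes / trials

print(simulate_foul_shots(rng=random.Random(1)))
```

With many trials the estimate settles near the exact binomial probability of about .046, in line with the student's rough 1-in-15 result.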
The Monte Carlo method is not explained by the instructor.
Rather it is discovered by the students. With a bit of guidance
the students invent, from scratch, the procedures for solution.
For example, at the beginning of the first class the instructor may
ask, "What are the chances that if you have four children three of
them will be girls?" A few students do some calculations without
success (in a naive class); the rest fidget. Then the students say
that they don't know how to get the answer. The instructor presses
the class to think of some way to come up with an answer. Someone
suggests in jest that everyone in the class should go out and have
four children. The instructor chooses to take this seriously. He
says that this is a very good suggestion, though it has some
obvious drawbacks. Someone suggests substituting a coin for a
birth. This raises the issue of whether it is reasonable to
approximate a 106:100 event with a 50-50 coin, and what is
reasonable under various conditions. The instructor points out
that the class still has no answer. Someone suggests that each
student throw four coins. Someone else amends this by saying that
four flips of one coin are just as good. The instructor questions
whether the two methods are equivalent, and the class eventually
agrees that they are. Finally, each student performs a trial, the
data are collected, and an estimate is made. Someone wonders how
good the estimate is. Someone else suggests that the experiment be
conducted several more times to see how much variation there is.
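A reader can replay the class's coin experiment in code. This sketch is ours, not from the paper; it approximates each birth by a fair coin, as the class agreed to do, and estimates the chance that exactly three of four children are girls.

```python
import random

def estimate_three_girls(trials=10_000, rng=None):
    """Estimate the chance that exactly three of four children are
    girls, approximating each birth by a fair coin flip."""
    rng = rng or random.Random()
    hits = 0
    for _ in range(trials):
        # One trial: four "births," each a girl with probability 1/2.
        girls = sum(rng.random() < 0.5 for _ in range(4))
        if girls == 3:
            hits += 1
    return hits / trials

print(estimate_three_girls(rng=random.Random(2)))
```

Under the coin model the exact answer is 4/16 = .25; the estimate wanders around that value from run to run, which is itself the point of the closing classroom exchange about variation.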
The meaning of the concept "chances" comes up in the
discussion, and "probability" is defined pragmatically. By this
process of self-discovery, students develop useful operating
definitions of other necessary concepts such as "universe,"
"trial," "estimate," and so on. And together they invent, after
false starts and class corrections, sound approaches to easy and
not-so-easy problems in probability and statistics. For example,
with a bit of guidance, an average university class can be brought
to re-invent such devices as a Monte Carlo version of Fisher's
randomization test. In an earlier report (Simon, 1969, Chapters 23
and 24) the flexibility and range of the Monte Carlo method is
shown in problems ranging from permutations to correlation to
randomization tests.
In this manner, the students learn more than how to do
problems. They gain the excitement of true intellectual discovery.
And they come to understand something of the nature of mathematics
and its creation.
Though the experience of shuffling cards and counting tabled
random numbers is educational at first, it tends to be a nuisance
after a while, and a deterrent to the use of the method.
Furthermore, in some problems, the sample size required for the
desired accuracy makes such hand methods onerous if not impossible.
Therefore, a computer program, SIMPLE, has been developed that will
perform the necessary operations rapidly and yet can be used
immediately by a person with absolutely no computer experience.
The SIMPLE program is also designed to be used as the method of
choice for sophisticated statisticians in many sorts of
applications. This program is described in Simon and Weidenfeld
(1975), and materials are available upon request.
A systematic Monte Carlo method is taught at the University of
Illinois: this is an important difference from some examples in
the literature of ad hoc Monte Carlo problem solution, e.g.
Zelinka, 1973. The student is taught to work in a series of
discrete steps. The first step is the construction of the universe
whose behavior one is interested in. The second step (or set of
steps) is the drawing of a sample from that universe. The third
step is the computation of the statistic of interest, and, in
inferential statistics, comparison of the experimental statistic to
the "observed" or "bench-mark" statistic. The fourth step is the
repetition of the sampling procedure a large number of times. And
then the fifth step is the calculation of the proportion of
"successes" to experimental trials, which estimates the probability
of the event in which one is interested.
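The five steps can be followed mechanically in code. As an illustration, here is a sketch (ours, not the SIMPLE program) of a Monte Carlo version of the two-sample randomization test mentioned earlier, with each step labeled; the data and names are invented.

```python
import random

def randomization_test(group_a, group_b, trials=10_000, rng=None):
    rng = rng or random.Random()
    # Benchmark for Step 3: the observed difference in group means.
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    # Step 1: construct the universe. Under the null hypothesis the
    # group labels are arbitrary, so the universe is the pooled scores.
    universe = group_a + group_b
    successes = 0
    for _ in range(trials):  # Step 4: repeat the sampling many times.
        # Step 2: draw a sample by shuffling and re-splitting the pool.
        rng.shuffle(universe)
        a = universe[:len(group_a)]
        b = universe[len(group_a):]
        # Step 3: compute the statistic and compare to the benchmark.
        if sum(a) / len(a) - sum(b) / len(b) >= observed:
            successes += 1
    # Step 5: the proportion of successes estimates the probability of
    # a difference this large arising by chance alone.
    return successes / trials

# Invented data: an extreme difference between two small groups.
print(randomization_test([7, 8, 9], [1, 2, 3], rng=random.Random(3)))
```

For these invented data only one of the twenty equally likely splits reproduces a difference as large as the observed one, so the estimate hovers near 1/20 = .05.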
The experiments. Three controlled experimental tests of the
pedagogical efficiency of the Monte Carlo method have now been
completed.
The University of Illinois Experiment: The experimental
situation was a one-semester elementary statistics class of 25
mostly economics and business undergraduate majors in 1973 at the
University of Illinois. The course, taught by Simon, was primarily
a conventional statistics course, using a conventional text (Spurr
and Bonini, rev. ed., 1973); the Monte Carlo method was taught only
as a supplement, with no reading on it other than the simulation
chapter in Spurr and Bonini and the Zelinka article (1973) and
suggested reading in Simon (1969, Chapters 23-25). All problems
that were treated by the Monte Carlo method in class were also
demonstrated by analytic methods, whereas many problems were solved
by analytic methods that were not treated in class by the Monte
Carlo method. Therefore, analytic methods had a very large
advantage over the Monte Carlo method in student time and
attention, both in reading and in class.
Among the ten questions on the final exam (of which the
student had to answer 8), there were four that the student could
choose whether to answer by analytic methods or by Monte Carlo; the
question given earlier is an example of these four questions. The
choices of method by the students on the optional-method question
give an indication of the usefulness of the Monte Carlo method.
Some additional conditions relevant to the experiment: The
students could bring books and notes. (A closed-book exam, where
formulae had to be remembered, would disadvantage analytic methods
relative to Monte Carlo methods.) And the four optional-method
problems were extremely simple ones for the use of analytic
methods. (Complex problems would tend to improve the relative
performance of the Monte Carlo method, because complexity is its
comparative advantage.)
The results were as follows:
1. Almost every student used the Monte Carlo method for some
question. This is the most exciting result of the experiment,
because it suggests that the method has some usefulness to almost
everyone. In total, more than half of the answers used the Monte
Carlo method (44 of 86).
2. Almost every student did some questions by analytic
methods. This implies that teaching the Monte Carlo method does
not prevent the learning and use of analytic methods: that is,
Monte Carlo does not drive out analytic methods. This is also a
gratifying result.
3. There is a slightly-greater propensity for students who
did better on the examination as a whole to do a larger proportion
of problems by the Monte Carlo method. But the relationship is
certainly not strong, which suggests that the Monte Carlo method is
useful both to the good and to the less-good students. (And the
lack of strong relationship also implies that we need not worry
that the students who got better scores on the exam did so because
they used the Monte Carlo more extensively and were therefore
graded more easily.)
4. On each question some students used analytic methods and
others used Monte Carlo methods. This shows that the Monte Carlo
method is not specialized to some sorts of problems in the minds
and practices of the students.
5. The average grades that the students received were higher
on the questions answered with the Monte Carlo method than on those
questions answered with analytic methods: 9.1 versus 7.5 on a
scale of 10.
Polk Community College Experiment: At Polk Community College,
Winter Haven, Florida, in 1974, Shevokas taught separate classes of
General Mathematics, a 6-week, 17-class-hour unit in probability
and statistics, in three ways: conventional analytic method, Monte
Carlo method with computer, Monte Carlo method without computer.
The enrollments were 19, 39, and 13, respectively. Beforehand, the
groups were given a cooperative Arithmetic Achievement Test
(Educational Testing Service, 1962) and two
attitude-toward-mathematics tests (Aiken and Dreger, 1961; McCallon
and Brown, 1971; sample item: "The feeling I have toward math is
a good feeling"). The differences in results among groups were not
statistically significant, so we can safely consider that the
groups were similar to start with.
Only a mini-computer was available, and hence the types of
programs that could be offered were not satisfactory. And the
computer group had less time to learn probability because of the
time devoted to learning about the computer programs. For these
and other reasons we would have liked to confine our attention to
the non-computer aspects of the experiment, but we include the
with-computer group to increase Monte Carlo sample size.
The conventional analytic group was assigned two conventional
chapters on probability and statistics in a basic text (Meserve and
Sobel, 1973); the Monte Carlo group was given duplicated reading
materials prepared by Shevokas.
1. All students were given the same seven-question exam on
completion of the probability unit; a typical question was:
"Suppose a machine produces bolts, 10% of which are defective.
Find the probability that a box of three bolts contains at least
one defective bolt." The mean scores were: conventional, 35.8;
Monte Carlo no-computer, 58.5; Monte Carlo with-computer, 50.8, on
a basis of 100. While one could wish for higher scores altogether,
the Monte Carlo groups did better. The difference between Monte
Carlo and conventional groups is statistically significant, but
even more important, it is of an educationally significant
magnitude; the Monte Carlo no-computer group got 62% higher scores
than the conventional group.
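The bolt question yields to the same Monte Carlo approach. Here is a sketch of what a solution might look like in code (ours; the students of course worked by hand or with simple programs, and the names are illustrative):

```python
import random

def box_has_defective(p_defective=0.10, box_size=3,
                      trials=10_000, rng=None):
    """Estimate the chance that a box of bolts contains at least
    one defective bolt."""
    rng = rng or random.Random()
    hits = 0
    for _ in range(trials):
        # One trial: a box of bolts, each defective with probability 10%.
        if any(rng.random() < p_defective for _ in range(box_size)):
            hits += 1
    return hits / trials

print(box_has_defective(rng=random.Random(4)))
```

The exact answer is 1 - .9^3 = .271, so an estimate in the neighborhood of .27 counts as a correct solution.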
2. The two attitude-toward-mathematics scales were again
administered afterwards. The Monte Carlo groups showed more
favorable attitudes than the conventional group, with the
non-computer Monte Carlo group being most favorable; considering
the two scales together, the post-scores differ significantly among
the groups. Perhaps most interesting, the mean changes from
"before" to "after" for the Aiken-Dreger and McCallon-Brown scales
were: conventional, -5% and -9%; Monte Carlo with-computer, 0% and
-8%; Monte Carlo no-computer, +22% and +8%. To put it more
concretely, five of 19 conventional-group students had an improved
attitude, 13 a worsened attitude (one tie); among the Monte Carlo
no-computer group, 8 students had improved attitudes, 5 worsened.
(The attitudes of the Monte Carlo with-computer group were
apparently harmed by their need to spend extra hours on campus to
use the computer.)
3. It is an important result that despite an initially-cool
attitude toward the no-computer Monte Carlo method by the teacher,
she came to enjoy teaching the Monte Carlo method much more than
the conventional method, because the students reacted to the Monte
Carlo work in an interested and enthusiastic manner.
Olivet Nazarene College Experiment: At Olivet Nazarene (four
year) College, Kankakee, Illinois, during the second half of each
semester in 1974-1975 one class in Mathematics for General
Education was taught probability and statistics by Atkinson in a
conventional analytic fashion, while a second class was taught the
Monte Carlo method. Class size was 21 students in each section the
first semester; in the second semester there were 37 and 34
students, respectively, in the Monte Carlo and conventional
sections. As in the case of the Polk Community College experiment,
students in this course generally have low skills and little
interest in mathematics. Comparable duplicated reading materials
prepared by Atkinson on the conventional and Monte Carlo methods
were distributed to the respective classes.
1. In the first semester two pre-exam quizzes were given to
each group, whereas three quizzes were given in the second
semester. These quizzes each contained 1, 2 or 3 probability or
statistical problems. On each quiz the Monte Carlo section did
better than the conventional group, achieving class mean scores as
much as twice as high as the conventional group.
2. The first part of the final exam the first semester was
"conceptual." It asked the student to analyze problem data and
describe the population, the hypotheses, and so on. The
conventional group did better, getting a mean score of 47.9
compared to 40.8 for the Monte Carlo group (t=1.25). The analogous
first part of the second semester's final exam was a 20-question
multiple-choice test on the concepts of hypothesis testing. This
time the Monte Carlo group did better, 60.3 to 51.8 (t=2.06).
3. The most important measure of performance was the second
part of the final exam containing, respectively, three and four
problems in the first and second semesters. Mean scores were:
Semester 1: Monte Carlo, 69.5; conventional, 59.4 (t=1.7).
Semester 2: Monte Carlo, 67.6; conventional, 56.6 (t=2.06).
Inspection of second-semester tests showed that the Monte Carlo
group did better on each and every question.
4. If one considers questions and answers only as "right" or
"wrong," in the second semester 45.9% of the Monte Carlo students
answered at least two questions correctly, whereas among the
conventional group only 26.5% got two or more questions right. And
the Monte Carlo group got 34.4% of the total questions right
whereas the conventional group got 19.8% of the questions right.
Comparative scoring of Monte Carlo and analytic answers requires
some judgment. But the fact that the teachers in the Polk and
Olivet Nazarene experiments (though not in the Illinois experiment)
were not initially in favor of the non-computer Monte Carlo method
provides some protection.
5. The Monte Carlo section had less mathematical ability than
the conventional section in both semesters; the Monte Carlo groups
had lower mean scores on the ACT math test, two quizzes and the
midterm exam on the algebra material taught in the first half of
the semesters, some of the differences being statistically
significant. Hence the better performance shown by the Monte Carlo
groups on the probability and statistical material was despite a
lower endowment of mathematical ability.
6. A twenty-question attitude-toward-mathematics scale
similar to the Aiken-Dreger scale was given before and after the
probability-statistics unit. In both semesters the Monte Carlo
groups began with less favorable attitudes. But by the end of the
experiment the Monte Carlo groups' attitudes toward math were more
favorable than those of the conventional groups.
7. An attitude-toward-probability-and-statistics scale was
given after the probability-statistics instruction. In the first
semester, eight of ten questions were answered more favorably among
the Monte Carlo group by substantively large and statistically
significant differences; the other two differences were tiny, and
the questions referred to future plans rather than attitudes. In
the second semester, 15 of 17 attitudes were more favorable in the
Monte Carlo group, most of the differences being large and all with
t > 1; two other questions were very slightly more favorable in the
conventional group, with t < .23.
8. The teacher's subjective evaluation, as in other classes
where the Monte Carlo method has been taught, is that the students
seem
relatively interested in and enthusiastic about the material, with
a great deal of class discussion. This made for an enjoyable
experience for the teacher, despite initial doubts about the value
of the Monte Carlo method.
Conclusions. Taken as a whole, the evidence shows that the
Monte Carlo method is a tool that students can and will use to
arrive at correct answers to probabilistic-statistical problems.
Therefore, it would seem to make sense to teach students to do
standard probabilistic questions with the Monte Carlo method. In
a conventional university probability or statistics course, this
implies teaching the Monte Carlo method along with the analytic
methods. In high school or college situations in which the student
will not get a course or even a long section on probability and
statistics, this implies teaching a block of 6-10 hours of the
Monte Carlo method in the basic mathematics course so that the
student will have at least some tools at his disposal.
If one has to make a pedagogical choice between analytic and
Monte Carlo methods, it would seem that Monte Carlo is the method
of choice on a "cost/benefit" basis; that is, it yields more
usable output per unit of learning input. But luckily one does not
usually have to make such a choice, because there can be plenty of
time in the conventional elementary course for the Monte Carlo
method to be treated along with the analytic method. And in a
situation where the Monte Carlo method and only the Monte Carlo
method might be taught, say a high school or junior college, the
conventional method usually has no real opportunity at present to
receive the attention it must have for students to acquire a usable
tool, and hence the conventional approach is not a real alternative
to the Monte Carlo method.
Lest this be unclear or seem to equivocate: Where there is
limited time, or where students will not be able to grasp
conventional methods firmly, we advocate teaching the Monte Carlo
approach, and perhaps that only. Where there is more time, and
where students will be able to learn conventional methods well, we
advocate (a) teaching Monte Carlo methods at the very beginning as
an introduction to statistical thinking and practice; and (b)
afterwards teaching the Monte Carlo method with the conventional
method as alternatives to the same problem, to help students learn
analytic methods and to give them an alternative tool for their
use.
Teaching the Monte Carlo method also has additional
pedagogical advantages. It produces (in fact, demands) a high
level of class participation and teacher-student interaction. This
makes the class time lively and enjoyable. The method also leads
students to discover for themselves the intuitive meaning of
fundamental concepts such as independence. And it increases their
readiness to challenge the validity of the underlying data, which
they must receive in raw form for the Monte Carlo method rather
than in the defect-hiding summary form as, for example, "a
population with μ = 100 and σ = 10," the sort of language in which
conventional problems are usually stated.
The advantage of the Monte Carlo Method seems to stem from its
greater simplicity in a fundamental intuitive sense due to having
fewer "working parts," and because the student never needs to take
anything on faith, especially the sort of faith that is necessary
with analytic methods that work by way of the central limit theorem
("It is shown in advanced texts that ...").
Do we not owe it to our students and ourselves to at least
give the Monte Carlo method a hearing and a try?
Acknowledgment. Kenneth Travers supervised Atkinson's and
Shevokas's theses at the College of Education of the University of
Illinois, from which their results are drawn; we are grateful for
Travers's important contribution to this work. We also appreciate
helpful comments on an earlier draft from Bob Bohrer.
References
D.T. Atkinson, A Comparison of the Teaching of Statistical
Inference by Monte Carlo and Analytical Methods, Ph.D. thesis,
University of Illinois, 1975.
J.H. Chung and D. A.S. Fraser, Randomization tests for a
two-sample problem, J. Amer. Statist. Assoc., 53 (September 1958)
729-735.
Meyer Dwass, Modified randomization tests for nonparametric
hypothesis, Ann. Math. Statist., 28 (March 1957) 181-187.
Educational Testing Service, Cooperative Mathematics Tests,
Arithmetic, Form A, Princeton, N.J. 1962.
A.L. Edwards, Techniques of Attitude Scale Construction,
Appleton-Century-Crofts, New York, 1957.
E.L. McCallon and J.D. Brown, A semantic differential
instrument for measuring attitude toward mathematics, J.
Experimental Education, 39 (Summer 1971) 69-79.
B.E. Meserve and M.A. Sobel, Introduction to Mathematics, 3rd
ed., Prentice-Hall, Englewood Cliffs, N.J., 1973.
Carolyn Shevokas, Using a Computer-Oriented Monte Carlo
Approach to Teach Probability and Statistics in a Community College
General Mathematics Course, Ph.D. thesis, University of Illinois,
1974.
J.L. Simon, Basic Research Methods in Social Science, Random
House, New York, 1969.
_____, with the assistance of Allen Holmes, A really new way
to teach (and do) probability and statistics, The Mathematics
Teacher, 62 (April 1969) 283-288.
_____, and Dan Weidenfeld, SIMPLE: Computer Program for Monte
Carlo Statistics Teaching, mimeo, 1975.
W.A. Spurr, and C.P. Bonini, Statistical Analysis for Business
Decisions, rev. ed., Irwin, Homewood, 1973.
Martha Zelinka, How Many Games to Complete a World Series? in
F. Mosteller, et al. (eds.) Statistics by Example, Addison-Wesley,
Reading, Mass., 1973.
DEPARTMENT OF ECONOMICS, UNIVERSITY OF ILLINOIS AT
URBANA-CHAMPAIGN, URBANA IL 61801
DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE, OLIVET
NAZARENE COLLEGE, KANKAKEE, IL 60901
DEPARTMENT OF MATHEMATICS AND PHYSICAL SCIENCE, THORNTON
COMMUNITY COLLEGE, SOUTH HOLLAND, IL 60473