Pick A Sample - Resampling

Note:  For information about software or other materials
mentioned here, please call 703-522-5410, or contact
stats@resample.com.

                               PICK A SAMPLE

                             by Ivars Peterson

                       _Science News_, July 27, 1991

    Few college students escape taking an introductory course in
statistics. But for many of these students, the lessons don't
seem to stick.  They remember the pain but not the substance.

    Such initial courses tend to turn students off, says Barbara
A. Bailar, executive director of the American Statistical
Association in Alexandria, Va.  To remedy the situation, she
says, "a lot of people are now trying to restructure that first
course in statistics."

    "We're trying to reform what we do so that we're not always
lecturing and the students just sitting there trying to
transcribe what we say into their notebooks," says David S. Moore
of Purdue University in West Lafayette, Ind., author of a number
of widely used statistics textbooks.  "Statistics teaching should
get students actively involved in working with data to discover
principles for themselves."

    The computer revolution has already wrought some salutary
changes.  New computer-based techniques, collectively known as
"resampling," have provided statisticians with innovative and
powerful methods of handling data.  At the same time, computers
have opened up novel approaches to teaching the basic concepts of
statistics.  These techniques promise to eliminate many of the
shoals encountered by novices venturing into statistical waters.

    One pioneer in computer-based approaches to statistical
education is not a statistician but an economist.  In 1967,
Julian L. Simon--now a professor of business administration at
the University of Maryland in College Park--became convinced
there had to be a better way to teach basic statistics.  This
realization was triggered by his discovery
that all four graduate students taking his course on research
methods in business had used "wildly wrong" statistical tests to
analyze data in their class projects.

    "The students had swallowed but not digested a bundle of
statistical ideas - which now misled them - taught by professors
who valued fancy mathematics even if useless or wrong," Simon
says.

    The experience led Simon to develop an alternative approach
to teaching statistics.  His method emphasizes intuitive
reasoning and involves using a computer to perform
experiments on the available data - a process known as
resampling - to get meaningful answers to statistical problems.
In his scheme, students avoid having to tangle with mysterious
formulas, cryptic tables and other forms of mathematical magic to
get their results.

    "The theory that's usually taught to elementary [statistics]
students is a watered-down version of a very complicated theory
that was developed in order to avoid a
great deal of numerical calculation," says statistician Bradley
Efron of Stanford University.  "It's really quite a hard theory,
and should be taught second, not first."

    Simon's resampling approach, refined over many years and
tested on a variety of students, is now starting to catch the
attention of the statistics and education communities.

    "With six or nine hours of instruction, students are
generally able to handle problems usually dealt with only in
advanced university courses," Simon says.  Moreover,
classroom surveys show that when his resampling techniques
replace conventional methods, students enjoy the subject much
more.

    In one sense, the resampling approach brings statistics back
to its roots.  More than 300 years ago, before the invention of
probability theory, gamblers had to rely on
experimentation - by dealing hands or throwing dice - to estimate
odds.  The computer has returned this kind of empirical approach
to the center of statistics, shifting emphasis away from the
mathematical statistical methods developed between 1800 and 1930,
when computation was slow and expensive and had to be avoided as
much as possible.

    A resampling experiment uses a mechanism such as a coin, a
die, a deck of cards or a computer's stock of random numbers to
generate sets of data that represent samples taken from some
population. The availability of such "artificial" samples permits
students - as well as researchers - to experiment systematically
with data in order to estimate probabilities and make statistical
inferences.

    Consider the following question: what are the chances that
basketball player Magic Johnson, who averages 47 percent success
in shooting, will miss eight of his next 10 shots?

    You could try to answer it by looking up a formula of some kind
and plugging in numbers to calculate a probability.  In Simon's
approach, the answer emerges instead from a
numerical experiment.

    The computer generates 10 random numbers between 1 and 100.
It counts any numbers that fall between 1 and 53 as missed shots,
then generates a second set of 10 random numbers, and so on, each
time counting up the number of misses.  After repeating this
process, say, 100 times, the computer determines the number of
trials in which the player misses eight or more shots.  This
figure establishes the likelihood of the player suffering a
streak of misses, 6.5
percent in this example.
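
    The experiment just described can be sketched in a few lines
of Python.  This is only an illustration, not Simon's own
software; the function name estimate_miss_streak and its
parameters are invented for the sketch.  Because the answer comes
from random draws, it changes from run to run, and an estimate
based on only 100 trials can differ noticeably from one based on
many thousands.

```python
import random


def estimate_miss_streak(trials=100, shots=10, miss_pct=53,
                         threshold=8, seed=None):
    """Estimate the chance of missing `threshold` or more of `shots`.

    Each shot is a random whole number from 1 to 100; numbers from
    1 through `miss_pct` count as misses (53 for a 47 percent shooter).
    """
    rng = random.Random(seed)  # seed is optional, for repeatable runs
    bad_trials = 0
    for _ in range(trials):
        # One trial: take `shots` shots and count the misses.
        misses = sum(1 for _ in range(shots)
                     if rng.randint(1, 100) <= miss_pct)
        if misses >= threshold:
            bad_trials += 1
    # Share of trials with `threshold` or more misses.
    return bad_trials / trials


if __name__ == "__main__":
    print("100 trials:   ", estimate_miss_streak(trials=100))
    print("10,000 trials:", estimate_miss_streak(trials=10000))
```

    Raising the number of trials steadies the estimate, at the
cost of more computing.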

    The answer obtained through this approach is as valid as that
obtained through conventional methods, Simon says. Moreover,
students learn important lessons about statistical inference from
such an exercise.

    Stripped of the mathematical trappings associated with the
standard approach to statistics, the issue confronting a student
or researcher shifts from one of simply calculating an answer to
one of statistical inference.  Instead of asking which formulas
they should use, students begin tackling such questions as what
makes certain results
statistically "significant."

    "It becomes mostly a matter of clear thinking," Simon says.

    Consider the results that show Magic Johnson will miss eight
or more shots out of 10 tries about 6.5 percent of the time.
Because of random variation, such streaks of misses
occur naturally, even when a player is not in a slump.  Hence,
when a good player misses eight of 10 shots, it doesn't
necessarily mean that he or she must be replaced or stopped from
shooting.  The streak could quite easily occur by chance.

    The resampling approach addresses a key problem in
statistics: how to infer the "truth" from a sample of data that
may be incomplete or drawn from an ill-defined
population.  For example, suppose 10 measurements of some
quantity yield 10 slightly different values.  What is the best
estimate of the true value?

    Such problems can be solved not only by the use of randomly
generated numbers but also by the repeated use of a given set of
data.  Consider, for instance, the dilemma faced by a company
investigating the effectiveness of an additive that seems to
increase the lifetime of a battery.  One set of experiments shows
that 10 batteries with the additive lasted an average of 29.5
weeks, and that 10 batteries without the additive lasted an
average of 28.2 weeks (see table).  Should the company take
seriously the difference between the averages and invest
substantial funds in the development of additive-laced batteries?

    To test the hypothesis that the observed difference may be
simply a matter of chance and that the additive really makes no
difference, the first step involves combining the two samples,
each containing 10 results, into one group.  Next, the combined
data are thoroughly mixed, and two new samples of 10 each are drawn
from the shuffled data.  The computer calculates the new averages
and determines their difference, then repeats the process 100 or
more times.

    The company's researchers can examine the results of this
computer simulation to get an idea of how often a difference of
1.3 weeks comes up when variations are due entirely to chance.
That information enables them to evaluate the additive's "true"
efficacy.
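
    The shuffling procedure can be sketched in Python, using the
battery lifetimes from the table that accompanies this article.
The sketch is an illustration only; the function name
shuffle_test is invented here.  It also makes one assumption the
article does not spell out: a shuffled difference counts if it is
at least as large as the observed one in either direction.

```python
import random

# Battery lifetimes in weeks, from the article's table.
WITH_ADDITIVE = [33, 32, 31, 28, 31, 29, 29, 24, 30, 28]
WITHOUT_ADDITIVE = [28, 28, 32, 31, 24, 23, 31, 27, 27, 31]


def mean(values):
    return sum(values) / len(values)


def shuffle_test(group_a, group_b, trials=100, seed=None):
    """Return (observed difference, share of shuffles at least as large).

    Assumption: differences in either direction are counted, so the
    comparison uses absolute values.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)  # combine the two samples
    n_a = len(group_a)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # mix thoroughly
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-9:
            hits += 1
    return observed, hits / trials


if __name__ == "__main__":
    observed, share = shuffle_test(WITH_ADDITIVE, WITHOUT_ADDITIVE)
    print("Observed difference (weeks):", round(observed, 1))
    print("Share of shuffles at least that large:", share)
```

    A large share means that chance alone often produces a gap
as big as the one observed.  With only 100 shuffles, the share
will vary somewhat from one run to the next.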

    "The resampling approach ... helps clarify the problem,"
Simon and colleague Peter C. Bruce write in the winter 1991
CHANCE, released in late June.  "Because there are no
formulae to fall back upon, you are forced to think hard about
how best to proceed."

    In particular, "you get a much clearer idea of what
variability means," Efron says.  By seeing the possible variation
from sample to sample, you learn to recognize how variable the
data can be.

    In 1977, Efron independently invented a resampling technique
for handling a variety of complex statistical problems.  The
development and application of that technique, called the
"bootstrap," has given Simon's approach a strong theoretical
underpinning.

    In essence, the bootstrap method substitutes a huge number of
simple calculations, performed by a computer, for the complicated
mathematical formulas that play key roles in conventional
statistical theory.  It provides a way of using the data
contained in a single sample to estimate the accuracy of the
average or some other statistical measure.

    The idea is to extract as much information as possible from
the data on hand.  Suppose a researcher has only a single data
set, consisting of 15 numbers.  In the bootstrap
procedure, the computer copies each of these values, say, a
billion times, mixes them thoroughly, then selects samples
consisting of 15 values each.  These samples can then be
used to estimate, for example, the "true" average and to
determine the degree of variation in the data.
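
    A minimal Python sketch of the procedure follows.  The 15
values are invented for illustration and do not come from any
study, and the function name bootstrap_means is likewise
hypothetical.  Instead of literally copying each value a billion
times, the sketch draws each new sample "with replacement" from
the original 15, which gives essentially the same result.

```python
import random
import statistics

# Hypothetical data: 15 made-up measurements, not from the article.
SAMPLE = [12.1, 9.8, 11.4, 10.6, 13.0, 9.5, 10.9, 11.7,
          10.2, 12.4, 11.1, 9.9, 10.8, 11.9, 10.4]


def bootstrap_means(data, resamples=1000, seed=None):
    """Return the averages of `resamples` bootstrap samples of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # rng.choices draws n values with replacement, so a value may
    # appear several times in one sample or not at all.
    return [statistics.mean(rng.choices(data, k=n))
            for _ in range(resamples)]


if __name__ == "__main__":
    means = bootstrap_means(SAMPLE)
    print("Average of the original sample:", statistics.mean(SAMPLE))
    print("Average of the bootstrap averages:", statistics.mean(means))
    # The spread of the bootstrap averages indicates how much the
    # sample average might vary from one sample to another.
    print("Spread (standard deviation):", statistics.stdev(means))
```

    The same loop works for a median or any other measure: swap
statistics.mean for the function of interest.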

    The term "bootstrap" - derived from the old saying about
pulling yourself up by your own bootstraps - reflects the fact that
the one available sample gives rise to all the others, Efron
says.

    Such a computationally intensive approach provides freedom
from two limiting factors that dominate traditional statistical
theory: the assumption that the data conform
to a bell-shaped curve, called the normal distribution; and the
need to focus on statistical measures whose theoretical
properties can be analyzed mathematically.

    "Part of my work is just trying to break down the prejudice
that the classical way is the only way to go," Efron says.  "Very
simple theoretical ideas can get you a long way.  You can figure
out a pretty good answer without going into the depths of
[classical] theory."

    Efron and Robert Tibshirani of the University of Toronto
describe the application of the bootstrap procedure and other
computationally intensive techniques to a variety of
problems, both old and new, in the July 26 SCIENCE.

    "Efron's work has had a tremendous influence on the route
statistics is taking," Bailar says.  "It's a very powerful
technique - a major contribution to the theory and
practice of statistics.  We're going to see more and more
applications of it."

    But the method also has its critics.  "Not everyone is
enamored with these [resampling] techniques," says Stephen E.
Fienberg of York University in Toronto.  The skeptics argue that
"you're trying to get something for nothing," he says.  "You use
the same numbers over and over again until you get an answer that
you can't get any other way.  In
order to do that, you have to assume something, and you may live
to regret that hidden assumption later on."

    Efron and others who study and use the bootstrap method
concede that it doesn't work for every statistical problem.
Nonetheless, Efron says, it works remarkably well in many
different situations, and further theoretical work will help
clarify its limits.

    "The potential dangers for misuse are small compared to the
amount of good it does," says mathematician Persi Diaconis of
Harvard University.  "I think the bootstrap is one of the most
important developments [in statistics] of the last 10 years."

    The bootstrap's success among statisticians has helped
validate Simon's use of resampling techniques to teach statistics
and to solve real problems in everyday life.
"Simon is onto something that's very good," Bailar says.  "I
think it's the wave of the future."

    "Simon's approach is very consistent with a major theme in
statistical education," says Purdue's Moore.  "Statistics
education ought to concentrate on understanding data and
variation.  There's a movement away from black box calculation
and toward more emphasis on reasoning - more emphasis on actual
experience with data and automating the calculations so that the
students have more time to think about what they're doing."

    But bringing innovative ideas into the statistics classroom
remains a slow process.  Part of the problem, says Moore, is that
many statistics courses are taught not by
statisticians but by engineers, mathematicians and others who
learned statistics the old way and haven't seen firsthand how the
computer has revolutionized statistical analysis.

    "Statisticians work with data all the time," Moore says.
"The calculations are so messy and you have to do so many of them
that the professional practice of statistics has always relied on
automated calculation."

    Hence, experienced statisticians are probably more inclined
to accept teaching methods that rely on large amounts of
computation rather than on manipulation of standard formulas.
"There's been a major movement to get computing into the
introductory classroom, and to have it revolve around real
examples where there's some payoff," Fienberg says.

    Simon, who calls his approach "statistics without terror," is
convinced this is the way to go.  About two years ago, he
launched "The Resampling Project" to help spread the word.  The
project, conducted in association with the University of
Maryland, handles such functions as scheduling courses in
resampling statistics and providing written material and other
help to schools interested in trying these methods.

    Simon and Peter Bruce, who heads the project office,
videotaped a series of lectures this spring to demonstrate the
approach.  The University of Maryland Instructional
Television System now distributes those tapes.  Simon also has
developed a special computer language and software package that
enables a user to perform experimental trials on a wide variety
of data. With this package, students can learn how to develop an
abstract model of a real-life situation, to write a computer
program simulating that model and to draw statistical inferences
from the resulting data.  The package works equally well in the
high school or college classroom and in the laboratory or
workplace, Simon says.

    "The resampling method enables people to obtain the benefits
of statistics and probability theory without the shortcomings of
conventional methods because it is free of
mathematical formulas and is easy to understand and use," Simon
and Bruce contend in the CHANCE article.  "Our interest is in
providing a tool that researchers and decision makers, rather
than statisticians, can use with less opportunity of error and
with sound understanding of the process."

    Simon has gone so far as to issue the following challenge in
a number of publications:

    "I will stake $5,000 in a contest against any teacher of
conventional statistics, with the winner to be decided by whose
students get the larger number of both simple and complex
realistic numerical problems correct, when teaching similar
groups of students for a limited number of class hours."

    No one has yet accepted the wager.

EXAMPLE:

BATTERY LIFETIMES (weeks)

With Additive:     33 32 31 28 31 29 29 24 30 28
Average:  29.5
Without Additive:  28 28 32 31 24 23 31 27 27 31
Average:  28.2

    A set of tests reveals that batteries with a certain additive
last an average of about 1.3 weeks longer than batteries without
the additive (above).  How strong is the evidence in favor of the
additive?  Might the difference between the averages have
occurred by chance?

    One can find out if a difference of 1.3 is unusual by mixing
together all the observations, then repeatedly drawing two new
groups of 10 numbers each.  If this procedure routinely produces
differences between the group averages greater than 1.3, then one
can't rule out chance as an explanation.

    The histogram (below) displays the differences resulting from
100 trials.  In this sample, a difference of 1.3 or greater
occurs about 29 percent of the time, which suggests that chance
can't be ruled out.

   15
    -
    -
F   -
r   -                            *
e  10                            *
q   -                            *
u   -                    *       *   *
e   -                    *     * *   *
n   -                    *     * * * *       *
c   5                    *     * * * *       * * * * *
y   -        *           *     * * * * *     * * * * *
    -        *   *   *   * * * * * * * * *   * * * * *   *
    -        *   *   * * * * * * * * * * * * * * * * *   * *
    -        * * * * * * * * * * * * * * * * * * * * *   * *
    0
     ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^
                -2        -1         0         1         2

                      Difference between group averages