Resampling Stats

Drillhole

Problem

The file "drill100.dat" contains 100 measurements of the diameter of drill holes (see Gunter, 1991). What is the confidence interval for the mean?

Resampling Procedure

We will use a bootstrap method to estimate the confidence interval for the mean. We seek to know how reliable our estimate of the mean is. How much might it differ from one sample to the next? If we had time and money, we could take lots of additional samples and learn how much the mean diameter changes from sample to sample. Lacking time and money, we need a hypothetical universe to draw samples from. What is our best guess about what such a universe might look like? It is the observed sample. So, we can create our hypothetical universe by simply copying our original sample over and over until we have, say, millions of copies of observation #1, millions of copies of observation #2, and so forth.

Now we can draw samples from this universe and see how much the mean varies from one sample to the next. In practice, replicating millions of observations is tedious even on a computer, so we use a shortcut: when drawing each sample, we replace each observation after selecting it. This has the same effect as drawing from an infinitely large universe made of copies of our sample. This procedure, drawing a sample with replacement from the original sample, is called the bootstrap.
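To make the shortcut concrete, here is a minimal Python sketch (Python is not the language used on this page). It draws one sample the "literal" way, from a universe built by copying a small sample many times, and one the shortcut way, with replacement from the sample itself. The five values and the function names are made up for illustration and are not taken from "drill100.dat".

```python
import random


def resample_with_replacement(sample, rng):
    """Shortcut: draw len(sample) values, putting each one back after drawing it."""
    return [rng.choice(sample) for _ in sample]


def resample_from_copied_universe(sample, copies, rng):
    """Literal version: build a universe of `copies` copies of the sample,
    then draw len(sample) values from it without replacement."""
    universe = sample * copies
    return rng.sample(universe, len(sample))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical measurements, used only to illustrate the mechanics.
sample = [196.8, 197.4, 195.9, 198.1, 197.0]

shortcut = resample_with_replacement(sample, rng)
literal = resample_from_copied_universe(sample, 1000, rng)

print("with replacement:    ", shortcut)
print("from copied universe:", literal)
```

Both draws contain only values from the original sample, and any value may appear more than once. As the number of copies grows, the literal version behaves more and more like the shortcut, which is why the universe never needs to be built.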

  1. Copy the 100 measurements onto 100 marbles, which are put into an urn.
  2. Sample with replacement 100 times, recording the values drawn.
  3. Compute the mean of these values, and save the result in a scoreboard.
  4. Repeat steps 2-3 1,000 times.
  5. Determine the numeric interval that includes the middle 95% of the means recorded in the scoreboard. This is a 95% bootstrap confidence interval (technically, a bootstrap percentile interval).
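The urn procedure above can be sketched in Python as follows. This is an illustrative translation under stated assumptions, not the Resampling Stats implementation. The `data` list is a synthetic stand-in, because the contents of "drill100.dat" are not reproduced on this page. The helper names `percentile` and `bootstrap_percentile_interval` are hypothetical, and the interpolation rule may differ from the one the PERCENTILE command uses.

```python
import random
import statistics


def percentile(sorted_values, p):
    """Return the p-th percentile (0-100) of an already sorted list,
    using linear interpolation between neighbouring values."""
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def bootstrap_percentile_interval(data, n_resamples=1000, level=0.95, seed=None):
    """Bootstrap percentile interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    scoreboard = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)              # draw n values with replacement
        scoreboard.append(statistics.fmean(resample))  # record the resample mean
    scoreboard.sort()
    tail = 100.0 * (1.0 - level) / 2.0                 # 2.5 for a 95% interval
    return percentile(scoreboard, tail), percentile(scoreboard, 100.0 - tail)


# Synthetic stand-in for the 100 measurements (assumed values, NOT the real data).
data_rng = random.Random(1)
data = [data_rng.gauss(197.0, 3.0) for _ in range(100)]

lower, upper = bootstrap_percentile_interval(data, seed=42)
print(f"95% bootstrap percentile interval for the mean: {lower:.1f} to {upper:.1f}")
```

Because the data here are synthetic, the printed interval will not match the result reported for the real drill-hole measurements. Only the mechanics carry over.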

Computer Implementation In Resampling Stats

READ file "drill100.dat" diam
BOXPLOT diam                              ' a useful way of looking at the data, especially good for detecting outliers
REPEAT 1000
  SAMPLE 100 diam diam$                   ' generate a simulated sample "diam$"
  MEAN diam$ mean$
  SCORE mean$ scrboard                    ' save the mean of the simulated sample onto the scoreboard
END
PERCENTILE scrboard (2.5 97.5) interval   ' this range includes 95% of all values
PRINT interval
HISTOGRAM scrboard                        ' vector "scrboard" holds the simulated mean$ values

Results

[Figure: frequency histogram of the bootstrapped mean diameter of drill holes]

interval = 196.5 - 197.9 (this is the 95% confidence range for the bootstrapped means)

Conclusion

The 95% bootstrap percentile interval for the mean drill-hole diameter runs from 196.5 to 197.9.

References

Gunter, B. (1991, December). Bootstrapping: How to make something from almost nothing and get statistically valid answers. Quality Progress, pp. 97-103.


© 2003 Resampling Stats, Inc.