Resampling Stats


Fitness-2

[sum-of-absolute-deviations version; see FITNESS-1 for the chi-squared version]

Problem

At the Cooper Institute for Aerobics Research, the physical fitness of approximately 10,000 men was tested twice with a gap of 4.9 ± 4.1 years between tests (Blair et al., 1995). Fitness was defined as the ability to reach 85% of one's age-adjusted heart rate during a treadmill test. Some men were fit at both tests; others were unfit at one or more of the examinations. Death rates from both heart disease and all causes were monitored for approximately five years after the last examination. Death rates from all causes are shown in the table below. Can we conclude that there is a significant reduction in death rates among fit men?

In FITNESS-1, the possible relationship between aerobic fitness and mortality rates was analyzed using a sum-of-squares method, equivalent to conventional chi-squared. In this version, the same data will be analyzed using the sum of absolute deviations, unsquared. This method gives less emphasis to extreme values.

Look at observed deaths, along with the hypothetical figure "expected deaths." Expected deaths are calculated by distributing the 223 deaths among the four groups in proportion to the number of men in each.

Do observed deaths depart from expected deaths to a greater extent than chance might produce? We measure the extent of this departure with the sum of the absolute deviations from expected.

Fitness-2 Table. Death Rates Of Fit and Unfit Men

Fitness at test 1   Fitness at test 2   # of men   Deaths, all causes   Expected deaths   Difference (observed - expected)
Unfit               Unfit                    373                   32                 9   +23
Unfit               Fit                      650                   25                15   +10
Fit                 Unfit                    221                    9                 5   +4
Fit                 Fit                     8533                  157               195   -38
Totals              ----                    9777                  223

Note. Data are from Blair et al., 1995. The sum of absolute values of differences = 75.
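
The arithmetic behind the table can be checked with a short Python sketch. Python is not part of the original analysis, and the variable names are illustrative; the rounding to whole deaths follows the table.

```python
# Group sizes and observed deaths, in table order: U/U, U/F, F/U, F/F.
men = [373, 650, 221, 8533]
observed = [32, 25, 9, 157]

total_men = sum(men)          # 9777
total_deaths = sum(observed)  # 223

# Expected deaths: the 223 deaths shared out in proportion to group size,
# rounded to whole deaths as in the table.
expected = [round(total_deaths * n / total_men) for n in men]

# Test statistic: sum of absolute (unsquared) deviations from expected.
sum_abs_dev = sum(abs(o - e) for o, e in zip(observed, expected))

print(expected)     # [9, 15, 5, 195]
print(sum_abs_dev)  # 75
```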

Null hypothesis (H0): All groups share the same mortality rate. Alternative hypothesis (H1): Unfit men tend to have higher mortality.

Resampling Procedure

  1. Take 9,777 pieces of paper. Write "dead" on 223 of these pieces, "live" on the rest.
  2. Shuffle the paper. Then draw a sample of 373 pieces, labeling them "U/U" (to represent the unfit/unfit group).
  3. Count the number of "dead" in this sample.
  4. Similarly, draw samples of 650 and label them "U/F" (to represent unfit/fit), 221 "F/U," and 8,533 "F/F," and count the number dead in each group.
  5. Because the samples are drawn without replacement, every trial contains exactly 223 "dead," for an overall death rate of 223/9,777 = .0228. The expected number of deaths is therefore 9 in the sample of 373, 15 in the sample of 650, and so forth. Calculate the absolute differences between the expected and observed deaths in each group. Record the sum of these absolute differences on the scoreboard.
  6. Repeat steps 2-5 many times; the program below uses 100 trials.
  7. Determine how frequently the sum of absolute differences in the simulation is at least as large as 75 (the observed value).
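
The paper-slip procedure above can also be sketched in Python using only the standard library. This is an illustrative translation under stated assumptions, not the Resampling Stats program: instead of shuffling all 9,777 slips, it picks 223 random slot positions for the "dead" slips, which yields the same distribution of deaths across groups. The names `one_trial`, `simulate`, the seed, and the choice of 1,000 trials are assumptions made for this example.

```python
import random
from bisect import bisect_right

GROUP_SIZES = [373, 650, 221, 8533]   # U/U, U/F, F/U, F/F
EXPECTED = [9, 15, 5, 195]            # rounded expected deaths from the table
TOTAL_DEATHS = 223
OBSERVED_STAT = 75                    # observed sum of absolute deviations

# Cumulative slot boundaries when the shuffled slips are dealt out in group order.
BOUNDS = []
running = 0
for size in GROUP_SIZES:
    running += size
    BOUNDS.append(running)            # [373, 1023, 1244, 9777]


def one_trial(rng):
    """Place 223 'dead' slips at random among 9,777 slots; return the statistic."""
    dead_slots = rng.sample(range(BOUNDS[-1]), TOTAL_DEATHS)
    deaths = [0] * len(GROUP_SIZES)
    for slot in dead_slots:
        # bisect_right maps a 0-based slot number to the group that owns it.
        deaths[bisect_right(BOUNDS, slot)] += 1
    return sum(abs(d - e) for d, e in zip(deaths, EXPECTED))


def simulate(trials=1000, seed=0):
    """Run the trials with a seeded generator so results are repeatable."""
    rng = random.Random(seed)
    return [one_trial(rng) for _ in range(trials)]


scoreboard = simulate()
prob = sum(s >= OBSERVED_STAT for s in scoreboard) / len(scoreboard)
print(max(scoreboard), prob)
```

Sampling the positions of the deaths is a shortcut for the full shuffle; it keeps each trial fast while preserving sampling without replacement.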

Computer Implementation in Resampling Stats

MAXSIZE default 10000

we must tell Resampling Stats to leave sufficient room for larger vectors

DATA (9 15 5 195) expected

set up a vector holding the expected values, calculated from the size of each group

URN 223#8 9554#0 allmen

create an urn of 9,777 men: 223 eights (deaths) and 9,554 zeros (survivors)

REPEAT 100
  	SHUFFLE allmen allmen
  	TAKE allmen 1,373 uu$

simulate the group of 373 men who were unfit in both tests

  	COUNT uu$ =8 uudeath$

determine the number of deaths in this simulated sample group

  	TAKE allmen 374,1023 uf$

simulate the group of 650 men who were unfit at first, but fit later

  	COUNT uf$ =8 ufdeath$
  	TAKE allmen 1024,1244 fu$

simulate the group of 221 men who were fit at the first test but unfit at the second test

  	COUNT fu$ =8 fudeath$
  	TAKE allmen 1245,9777 ff$

simulate the remaining 8,533 men who were fit in both tests

  	COUNT ff$ =8 ffdeath$
  	CONCAT uudeath$ ufdeath$ fudeath$ ffdeath$ deaths$

put the results together in one vector, in the same order as the expected values

  	SUMABSDEV deaths$ expected sumdiff

in one command, Resampling Stats finds the sum of the absolute differences between two vectors

  	SCORE sumdiff scrboard
END
HISTOGRAM scrboard
COUNT scrboard >=75 k
DIVIDE k 100 prob
PRINT prob

Results

[Figure: Frequency histogram of the resampled sums of absolute deviations from expected]

prob = 0

Conclusion

The sum of absolute deviations from the expected was 75 for the study. In our resampling simulation of the null hypothesis, the maximum sum of differences was 50. Since a value as large as the real-life one was never encountered in the 100 random resamplings, the estimated probability of such a result under the null hypothesis is less than 1 in 100. We can therefore reject the null hypothesis and conclude that fit men had lower death rates.
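
A simulated probability of exactly 0 reflects the limited number of trials rather than a true impossibility. One common convention, which the original analysis does not use, is to report (k + 1) / (n + 1), where k is the number of resampled values at least as large as the observed one and n is the number of trials. The function name below is hypothetical; the sketch only illustrates that convention.

```python
def resampling_p_value(exceed_count, trials):
    """Estimate a p-value as (k + 1) / (n + 1).

    The observed result is counted as one extra trial, so the estimate
    can never be exactly zero.
    """
    return (exceed_count + 1) / (trials + 1)


# 0 exceedances in 100 trials, as in the run reported above.
p_estimate = resampling_p_value(0, 100)
print(p_estimate)
```

With 0 exceedances in 100 trials this gives 1/101, or about 0.0099, which still supports rejecting the null hypothesis at conventional levels.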

References

Blair, S.N., Kohl, H.W., Barlow, C.E., Paffenbarger, R.S., Gibbons, L.W., & Macera, C.A. (1995). Changes in physical fitness and all-cause mortality: A prospective study of healthy and unhealthy men. Journal of the American Medical Association, 273(14), 1093-1098.


© 2003 statistics.com LLC