Resampling Stats

 

Tallmen

Problem

The following table shows the frequencies with which 43 short men and 52 tall men were classified as "followers," "unclassifiable," or "leaders" (see Siegel & Castellan, 1988, p. 113). Is leadership independent of height? Are tall men more likely to be classified as leaders? More generally, is there a relationship between height and leadership classification? Our test measures the extent to which the observed results depart from what we would expect if height were independent of classification. We will use the chi-square statistic (the sum, over all cells, of the squared deviation from the expected count divided by the expected count).

Tallmen Table. Relationship Between Height and Leadership Classification

Leadership Classification   Short   Tall   Combined
Followers                      22     14         36
Unclassifiable                  9      6         15
Leaders                        12     32         44
Total                          43     52         95

Note. Data are adapted from Siegel & Castellan, 1988, p. 113.

Null hypothesis (H0): There is no relationship between height and leadership classification.

Alternative hypothesis (H1): Height and leadership classification are not independent of one another.

Resampling Procedure

First we calculate the chi-square value as the test statistic. Generate a table of observed values and also expected values (if leadership classification is independent of height) as shown below:

Leadership Classification   Short   Tall   Expect Short   Expect Tall   Dev'n Short   Dev'n Tall   Totals
Followers                      22     14             16            20             6            6       36
Unclassifiable                  9      6              7             8             2            2       15
Leaders                        12     32             20            24             8            8       44
Total                          43     52                                                               95

Note: How do we derive the expected counts, say the 16 expected short followers? Forty-five percent (43 out of 95) of the men are short. Overall, there are 36 followers. If height has nothing to do with classification (i.e., the two are independent), we would expect 45% of those 36 followers, or about 16, to be short. The deviations are shown as absolute differences between the observed and expected counts.

The chi-square is the sum of the squared deviations from expected, each divided by the expected count:

(6*6/16) + (6*6/20) + (2*2/7) + (2*2/8) + (8*8/20) + (8*8/24) = 10.99, which the rest of this example carries as 10.9
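The arithmetic above can be checked with a short script. The following is a minimal Python sketch, not part of the Resampling Stats program; the names observed, expected_counts, and chi_square are illustrative only. It derives the expected counts from the row and column totals and evaluates the chi-square sum twice: once with expected counts rounded to whole numbers, as in the table, and once unrounded.

```python
# Observed counts from the Tallmen table.
# Rows: followers, unclassifiable, leaders. Columns: short, tall.
observed = [[22, 14], [9, 6], [12, 32]]


def expected_counts(table):
    """Expected cell counts under independence:
    row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table, expected):
    """Sum over all cells of (observed - expected)^2 / expected."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


exact = expected_counts(observed)
# Whole-number expected counts, as used in the table.
rounded = [[round(e) for e in row] for row in exact]

print(rounded)
print(round(chi_square(observed, rounded), 2))  # rounded expected counts
print(round(chi_square(observed, exact), 2))    # unrounded expected counts
```

With the rounded expected counts the statistic comes out near 10.99; with unrounded expected counts it is about 10.71, so rounding the expected counts shifts the value slightly.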

Is a chi-square of 10.9 significant? Might it occur by chance with a random association of height with classification? To find out, we repeatedly simulate what happens in a population in which leadership is independent of height.

  1. Take 95 marbles and write "followers" on 36, "unclassifiable" on 15, and "leaders" on 44, and put these into an urn.
  2. Take without replacement 43 marbles to represent short men; the remaining 52 marbles represent tall men.
  3. Count the number of followers, unclassifiable, and leaders in the "short" and "tall" groups. Calculate a simulated chi-square from these data. Record that value.
  4. Repeat (2) and (3) 999 times to obtain the distribution of chi-squares from a single population. How often did the simulated chi-square equal or exceed the experimental value of 10.9?
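Steps 1 to 3 can be sketched for a single trial as follows. This is a hedged Python illustration rather than the Resampling Stats implementation; the function name one_trial, the dictionary names, and the fixed seed are assumptions made for the example, and the expected counts are the rounded values from the table above.

```python
import random
from collections import Counter

CATEGORIES = ["followers", "unclassifiable", "leaders"]
# Rounded expected counts from the table, keyed by category.
EXPECTED_SHORT = {"followers": 16, "unclassifiable": 7, "leaders": 20}
EXPECTED_TALL = {"followers": 20, "unclassifiable": 8, "leaders": 24}


def one_trial(rng):
    """Fill the urn, deal 43 'short' and 52 'tall' marbles without
    replacement, and compute the chi-square for that deal."""
    urn = ["followers"] * 36 + ["unclassifiable"] * 15 + ["leaders"] * 44
    rng.shuffle(urn)                  # random order = drawing without replacement
    short, tall = urn[:43], urn[43:]  # first 43 marbles, remaining 52
    short_counts, tall_counts = Counter(short), Counter(tall)
    chi_sq = sum(
        (short_counts[c] - EXPECTED_SHORT[c]) ** 2 / EXPECTED_SHORT[c]
        + (tall_counts[c] - EXPECTED_TALL[c]) ** 2 / EXPECTED_TALL[c]
        for c in CATEGORIES
    )
    return short_counts, tall_counts, chi_sq


rng = random.Random(1)  # fixed seed only so the sketch is repeatable
short_counts, tall_counts, chi_sq = one_trial(rng)
print(dict(short_counts), dict(tall_counts), round(chi_sq, 2))
```

Step 4 amounts to calling one_trial 999 times, recording each chi_sq, and counting how many of the recorded values are at least 10.9.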

Computer Implementation in Resampling Stats

DATA (22 9 12) short

the numbers of short men in the three categories of follower, unclassifiable, and leader

DATA (14 6 32) tall

vector "tall" holds comparable information for the tall men

DATA (16 7 20) exshort

these are expected numbers if height and leadership are independent

DATA (20 8 24) extall
CONCAT exshort extall expected

put both the expected vectors together in a single list

CONCAT short tall allmen

and do the same for the numbers in the real-world groups

SUBTRACT allmen expected diff
SQUARE diff diffsq
DIVIDE diffsq expected chi

the operations above result in the chi-square for the real-world data

SUM chi chi_sq
PRINT chi_sq

we should have calculated this in advance, but this is a cross-check

URN 36#7 15#8  44#9 men

#7 signifies a follower, #8 unclassifiable, #9 a leader

REPEAT 999
  SHUFFLE men men$

randomizing destroys whatever link there was between height and leadership

  TAKE men$ 1,43 short$

form a simulated group "short$"

  TAKE men$ 44,95 tall$

the rest of the values go into simulated group "tall$"

  COUNT short$=7 sf$

how many of the simulated "short" men were followers?

  COUNT short$=8 su$

how many were short-unclassifiable?

  COUNT short$=9 sl$

and how many short leaders?

  COUNT tall$=7 tf$

tall followers

  COUNT tall$=8 tu$

tall unclassifiable

  COUNT tall$=9 tl$

tall leaders

  CONCAT sf$ su$ sl$ tf$ tu$ tl$ all$

put these six simulated counts back into a single vector "all$"

  SUBTRACT all$ expected diff$

and proceed to compute a chi-square for these data

  SQUARE diff$ diffsq$
  DIVIDE diffsq$ expected adjsq$
  SUM adjsq$ chisq$

this time "chisq$" has a "$" attached to signify a simulated value

  SCORE chisq$ scrboard
END
'HISTOGRAM scrboard
'BOXPLOT scrboard

remove the apostrophe (') to activate either command if you want to see the distribution of results

COUNT scrboard >= 10.9 more

how often did a simulation chi-square at least equal the experimental value?

DIVIDE more 999 prob

convert that value into a proportion of the number of repeat runs

PRINT prob
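For readers without Resampling Stats, the program above can be mirrored roughly command by command in Python. This is a sketch under stated assumptions, not an official translation: the variable names follow the program where Python allows (the "$" suffix becomes "_sim"), the seed is arbitrary, and rng.sample stands in for SHUFFLE.

```python
import random

short = [22, 9, 12]                    # DATA (22 9 12) short
tall = [14, 6, 32]                     # DATA (14 6 32) tall
expected = [16, 7, 20] + [20, 8, 24]   # DATA exshort, DATA extall, CONCAT
allmen = short + tall                  # CONCAT short tall allmen


def chi_square(counts, expected_counts):
    # SUBTRACT, SQUARE, DIVIDE and SUM rolled into one expression
    return sum((o - e) ** 2 / e for o, e in zip(counts, expected_counts))


chi_sq = chi_square(allmen, expected)
print(round(chi_sq, 2))                # PRINT chi_sq

men = [7] * 36 + [8] * 15 + [9] * 44   # URN 36#7 15#8 44#9 men
rng = random.Random(2003)              # arbitrary seed, for repeatability only
scrboard = []

for _ in range(999):                        # REPEAT 999
    men_sim = rng.sample(men, len(men))     # SHUFFLE men men$
    short_sim = men_sim[:43]                # TAKE men$ 1,43 short$
    tall_sim = men_sim[43:]                 # TAKE men$ 44,95 tall$
    # COUNT ... sf$ su$ sl$ tf$ tu$ tl$, then CONCAT into all$
    all_sim = ([short_sim.count(k) for k in (7, 8, 9)]
               + [tall_sim.count(k) for k in (7, 8, 9)])
    scrboard.append(chi_square(all_sim, expected))  # SCORE chisq$ scrboard

more = sum(1 for value in scrboard if value >= 10.9)  # COUNT scrboard >= 10.9 more
prob = more / 999                                     # DIVIDE more 999 prob
print(prob)                                           # PRINT prob
```

Because the shuffles are random, prob varies from run to run and from seed to seed; with only 999 repetitions it rests on a handful of extreme trials, so more repetitions give a steadier estimate.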

Results

[Figure: frequency histogram of resampled chi-square values]

prob = 0.0043 after 10,000 runs

Conclusion

The null hypothesis is rejected. There is an association between height and leadership classification. By inspection of the data, we can see that tall men are overrepresented in the leader category, and short men are overrepresented in the follower category.

References

Siegel, S., & Castellan, N. J., Jr. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). New York: McGraw-Hill.


© 2003 statistics.com LLC