Resampling Stats


Quality-2

[Spearman correlation coefficient version; see QUALITY-1 for Pearson coefficient version]

Problem

A group of skin moisturizers was evaluated by a test panel and ranked by estimated quality, "1" being the highest quality and "48" the lowest (Consumers Union, 1986, as cited in Noreen, 1989, p. 26). The price per ounce was also calculated from retail prices. One would expect higher-priced cosmetics to be of higher quality. (Since "1" represents the highest quality, we expect a negative correlation, that is, cheaper cosmetics tending to be at the bottom of the quality list.) In QUALITY-1, we examined whether there was a significant relationship between price and quality. A highly significant positive correlation of .43 indicated that cheaper cosmetics tended to be of higher quality. One possible explanation was that a few high-priced but poor-quality items might have unduly influenced the correlation coefficient. In this version of the problem, we use a rank-order approach to eliminate the influence of extreme values.

Quality-2 Table. Price Per Ounce Of Skin Moisturizers In Order Of Descending Estimated Quality

Rank  Price/oz.    Rank  Price/oz.    Rank  Price/oz.    Rank  Price/oz.
  1.    $1.83       13.    $0.28       25.    $1.65       37.    $3.89
  2.     0.23       14.     0.11       26.     3.43       38.     0.17
  3.     1.52       15.     0.12       27.     0.59       39.     1.65
  4.     1.91       16.     0.12       28.     0.42       40.     0.38
  5.     0.25       17.     0.30       29.     0.40       41.     0.45
  6.     0.10       18.     0.45       30.     1.56       42.     1.30
  7.     0.12       19.     0.24       31.     0.24       43.     3.07
  8.     0.24       20.     0.22       32.     0.26       44.     1.42
  9.     0.33       21.     0.11       33.     1.69       45.     2.11
 10.     0.19       22.     0.25       34.     0.10       46.     6.10
 11.     0.26       23.     3.33       35.     0.62       47.     4.29
 12.     0.26       24.     1.31       36.     0.25       48.     0.25

Note. Data are from Consumers Union, 1986, as cited in Noreen, 1989, Table 28, p. 26.

We will measure correlation with the Spearman rank correlation coefficient. Convert the original price data into ranks, "1" being the lowest price and "48" the highest. Then correlate the price ranks with the quality ranks, using the Pearson correlation formula. The resulting coefficient is .41. Could a value this high have resulted from random factors if there were no underlying relationship between price and quality (the null hypothesis)?
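The rank conversion and correlation described above can be sketched in Python. This is a minimal illustration, not the method used to produce the published figure: the helper names `average_ranks` and `pearson` are hypothetical, and giving tied prices the mean of the ranks they span is an assumption, since the text does not say how ties were handled. The printed value may therefore differ slightly from the reported .41.

```python
import math

# Prices per ounce from the Quality-2 table, listed in quality-rank order (1..48).
prices = [
    1.83, 0.23, 1.52, 1.91, 0.25, 0.10, 0.12, 0.24, 0.33, 0.19, 0.26, 0.26,
    0.28, 0.11, 0.12, 0.12, 0.30, 0.45, 0.24, 0.22, 0.11, 0.25, 3.33, 1.31,
    1.65, 3.43, 0.59, 0.42, 0.40, 1.56, 0.24, 0.26, 1.69, 0.10, 0.62, 0.25,
    3.89, 0.17, 1.65, 0.38, 0.45, 1.30, 3.07, 1.42, 2.11, 6.10, 4.29, 0.25,
]


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank.

    Assumption: the source does not state its tie-handling rule.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


quality = list(range(1, 49))        # quality ranks 1..48
price_rank = average_ranks(prices)  # 1 = lowest price, 48 = highest
rho = pearson(price_rank, quality)  # Spearman = Pearson applied to ranks
print(round(rho, 2))
```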

Null hypothesis (H0): There is no correlation between quality and price. Alternative hypothesis (H1): There is a correlation between price and quality.

Resampling Procedure

We will simulate random correlations with 48 pairs of rank data, to determine how often a correlation coefficient of .41 could arise by chance.

  1. Write the numbers "1" through "48" on 48 playing cards.
  2. Shuffle the cards and deal them out in one long column.
  3. Correlate the numbers on each card with its position, that is, by a strict sequence from "1" to "48."
  4. Record the simulated correlation coefficient.
  5. Repeat steps 2-4 1,000 times. Compute the proportion of trials in which the simulated data showed a correlation coefficient as large as or larger than the one observed (.41).

Computer Implementation in Resampling Stats

DATA 1,48 costrank

we do not need to worry about the actual costs, so simply set up a set of cards with all 48 rank values

DATA 1,48 quality

put numbers "1" through "48" into the vector "quality"

REPEAT 1000

do a simulated correlation with randomized data

SHUFFLE costrank rank$

randomize the price data into a simulated vector "rank$"

CORR rank$ quality corr$

this is one possible correlation coefficient using randomized data

SCORE corr$ scrboard

retain a copy of this simulated correlation on a scoreboard

END
HISTOGRAM scrboard
COUNT scrboard >= 0.41 hits

how often did random shuffling reach the target value?

DIVIDE hits 1000 prob

adjust for the number of repeat loops

PRINT prob
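For readers without Resampling Stats, the same shuffle-and-correlate loop can be sketched in Python. This is a rough translation under stated assumptions, not part of the original program: the function names `pearson` and `permutation_p_value` and the `seed` parameter are hypothetical and were added only so that a run can be reproduced. Like the program above, it treats the cost ranks as the integers 1 through 48 with no ties.

```python
import math
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(observed=0.41, n_items=48, repeats=1000, seed=None):
    """Share of shuffles whose rank correlation is at least `observed`."""
    rng = random.Random(seed)                 # seed is an added convenience
    quality = list(range(1, n_items + 1))     # DATA 1,48 quality
    costrank = list(range(1, n_items + 1))    # DATA 1,48 costrank
    hits = 0
    for _ in range(repeats):                  # REPEAT 1000
        shuffled = costrank[:]
        rng.shuffle(shuffled)                 # SHUFFLE costrank rank$
        if pearson(shuffled, quality) >= observed:  # CORR, then COUNT >= 0.41
            hits += 1
    return hits / repeats                     # DIVIDE hits 1000 prob


prob = permutation_p_value(seed=1)
print(prob)
```

Counting hits inside the loop replaces the scoreboard vector. To draw the histogram as well, the simulated coefficients would have to be collected in a list instead.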

Results

Resampled correlation between price and quality

Outcome of five runs of 1,000 repeats:

prob = 0

prob = 0.001

prob = 0.004

prob = 0.001

prob = 0.002
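The five runs differ because each is an estimate based on only 1,000 shuffles. A short Python sketch, assuming the runs are independent and using the standard binomial formula for the standard error of a proportion, pools the five results and shows how much a single run can be expected to vary. The variable names are illustrative only.

```python
import math

runs = [0.0, 0.001, 0.004, 0.001, 0.002]  # the five reported values of prob
repeats = 1000                            # shuffles per run

# Recover the hit counts and pool them over all 5,000 shuffles.
hits = round(sum(p * repeats for p in runs))
pooled = hits / (repeats * len(runs))

# Binomial standard error of a proportion estimated from one 1,000-shuffle run.
se_single = math.sqrt(pooled * (1 - pooled) / repeats)

print(hits, pooled, round(se_single, 4))
```

With a standard error close to the size of the estimate itself, a single run of 1,000 repeats gives only a rough value. More repeats would narrow the spread.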

Conclusion

After the prices were converted to ranks, the correlation between price rank and quality rank was .41, very close to the Pearson correlation between price and quality (.43) found in QUALITY-1. This suggests that the price-quality correlation was not unduly influenced by a few extreme observations. The RS program emulated the null hypothesis of no relationship between price and quality. In five runs of 1,000 repeats each, the estimated probability of getting a correlation at least as large as .41 by chance ranged from 0 to .4%. We therefore reject the null hypothesis. The correlation was significant, with more expensive cosmetics tending to be of poorer quality.

References

Noreen, E.W. (1989). Computer intensive methods for testing hypotheses: An introduction. New York: Wiley.


© 2003 Resampling Stats, Inc.