References To Cite For Resampling Methods

Resampling methods represent a substantial shift in the way statistical inference is practiced. Still, they are based on literature that, in some cases, predates World War II. The methods themselves did not become popular until computers became widely available. Here are some references that can be cited to support the use of resampling methods.

Permutation Methods

Also called “exact” procedures, permutation methods are based on shuffling or permuting the data. The original standard references to cite are:

  • Fisher, R. A. The Design of Experiments. New York: Hafner; 1935.
  • Pitman, E. J. G. Significance tests which may be applied to samples from any population. Supplement to the Journal of the Royal Statistical Society, 1937; 4: 119-130 and 225-232 (parts I and II).
  • Pitman, E. J. G. Significance tests which may be applied to samples from any population. Part III. The analysis of variance test. Biometrika, 1938; 29: 322-335.
  • Dwass, M. Modified randomization tests for non-parametric hypotheses. Ann. Math. Statist., 1957; 28: 181-187.

Fisher described several permutation tests, including one in which the sizes of plants grown from one population of seeds were compared to the sizes of plants grown from a second population of seeds, using paired pots. Fisher determined the significance of the result by systematically permuting the results across pairs of pots, and thus generating a reference distribution under the null hypothesis of no difference between populations. Fisher also described the famous “Tea Taster” who claimed to be able to tell whether a cup of tea had the milk poured first, or the tea poured first. Her results in identifying which 4 cups were milk-first and which 4 were tea-first were compared to a reference distribution of all the possible ways in which 8 random guesses might have been made.
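The tea-tasting reference distribution is small enough to enumerate in full. The following is a minimal sketch in Python, not Fisher's own presentation: the function name and parameters are illustrative, and the number of cups the taster identified correctly is an input, since it is not stated above. It counts every way of choosing 4 of the 8 cups as "milk-first" and reports the fraction of choices that do at least as well as the taster.

```python
from fractions import Fraction
from itertools import combinations


def tea_tasting_p_value(n_cups=8, n_milk_first=4, n_correct=4):
    """Chance of naming at least n_correct milk-first cups by guessing alone.

    Illustrative helper (not from the cited references). The taster is
    assumed to know how many cups of each kind there are, so a guess is
    a choice of n_milk_first cups out of n_cups.
    """
    # Arbitrary labels for the cups that really were milk-first.
    true_milk_first = set(range(n_milk_first))
    # Reference distribution: every possible set of cups she could name.
    guesses = list(combinations(range(n_cups), n_milk_first))
    as_extreme = sum(
        1 for guess in guesses if len(true_milk_first & set(guess)) >= n_correct
    )
    return Fraction(as_extreme, len(guesses))


print(tea_tasting_p_value())             # 1/70
print(tea_tasting_p_value(n_correct=3))  # 17/70
```

With 8 cups there are 70 possible guesses, so a perfect score has probability 1/70 under the null hypothesis of pure guessing, while getting at least 3 of the 4 milk-first cups right has probability 17/70.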

Pitman, in a 3-part series, built an explicit framework for statistical inference based on permutation tests, demonstrating that the inferences concerning statistical significance do not depend on distributional assumptions for the populations involved.

Both Fisher and Pitman presented methods that involved enumeration of all possible permutations of the data. Dwass suggested that randomly selected permutations (i.e. repeatedly randomly shuffling the data) could do the job, too.
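The random-shuffling idea can be sketched as follows for a two-sample comparison of means. This is an illustration under stated assumptions rather than a transcription of Dwass's paper: the function name, the choice of test statistic, the default number of shuffles, and the add-one adjustment to the p-value (a common convention) are all choices made for this example.

```python
import random


def random_permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Approximate two-sided permutation p-value for a difference in means.

    Illustrative sketch: instead of enumerating every relabeling of the
    pooled data, draw n_shuffles random relabelings.
    """
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # random relabeling under the null hypothesis
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance so floating-point ties count as "as extreme".
        if diff >= observed - 1e-12:
            hits += 1

    # Add-one adjustment: the observed arrangement counts as one shuffle.
    return (hits + 1) / (n_shuffles + 1)


# Made-up data, for illustration only.
treated = [12.1, 14.3, 13.8, 15.0, 14.1]
control = [10.2, 11.0, 9.8, 10.9, 11.4]
print(random_permutation_test(treated, control))
```

More shuffles give a more precise approximation to the p-value that full enumeration would produce, at the cost of more computation.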

Bootstrap Methods

Bootstrap methods typically involve drawing samples with replacement from the original data, or from an appropriate model, usually for purposes of making inferences about how samples behave when drawn from a population.
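As a rough illustration of drawing samples with replacement from the original data, the sketch below computes a simple percentile interval for a statistic. The function name, the default statistic (the median), the number of resamples, and the percentile rule are assumptions made for this example; the references in this section describe a number of other bootstrap procedures.

```python
import random
import statistics


def bootstrap_percentile_interval(data, statistic=statistics.median,
                                  n_resamples=5_000, alpha=0.05, seed=0):
    """Simple percentile bootstrap interval for statistic(data).

    Illustrative sketch: each resample has the same size as the original
    data and is drawn from it with replacement.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic(rng.choices(data, k=n))  # choices() samples with replacement
        for _ in range(n_resamples)
    )
    # Read the interval endpoints off the sorted bootstrap estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up data, for illustration only.
sample = [3.1, 4.7, 5.2, 5.9, 6.3, 7.0, 7.4, 8.8, 9.5, 12.6]
print(bootstrap_percentile_interval(sample))
```

The sorted list of estimates stands in for the sampling distribution of the statistic, which is useful when no convenient formula for that distribution is available.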

References to cite include:

  • Efron, B. Bootstrap methods: another look at the jackknife. Ann. Statist., 1979; 7: 1-26.
  • Efron, B. The Jackknife, the Bootstrap and Other Resampling Plans. Philadelphia: SIAM; 1982.
  • Simon, J. L. Basic Research Methods in Social Science. New York: Random House; 1969.

Efron coined the term “bootstrap” in 1979, and since then the procedure has become widely used, particularly in cases where no analytical (or no easy analytical) method exists for determining the sampling distribution of a statistic or estimate. Simon presented what would now be called a bootstrap example as part of a compendium of examples in 1969.

Additional Sources

The above references illuminate the inception of these methods. Because both permutation and bootstrap methods are flexible and powerful, a substantial body of literature has developed around their theory and application. A comprehensive bibliography can be found at /bibliographies.html. In addition, Michael Chernick’s Bootstrap Methods: A Practitioner’s Guide contains an 83-page bibliography. Here are several other books that provide useful surveys of resampling methods:

Davison & Hinkley, Bootstrap Methods and Their Application
Efron & Tibshirani, An Introduction to the Bootstrap
Good, Resampling Methods: A Practical Guide to Data Analysis
Lunneborg, Data Analysis by Resampling
Simon, Resampling: The New Statistics (entire book available online)