CHAPTER I-1
THE STATISTICAL RESULTS OF RESAMPLING INSTRUCTION
EVALUATIONS OF TEACHING INTRODUCTORY STATISTICS VIA RESAMPLING
Julian L. Simon and Peter C. Bruce
SUMMARY
Controlled experiments in the 1970s found better classroom results for resampling than for conventional methods, even without computers. Students handled more problems correctly, and liked statistics much better, with resampling. Surveys in the 1990s of student judgments of courses using the resampling method - introductory classes and graduate students, from Frederick Community College in Maryland to Stanford University Graduate School in California, and abroad - show that students approve of the method. They say they learn from it, would recommend a course taught with the resampling method, find it interesting, and use what they learned in the years after they finish the course. These results should constitute a prima facie case for at least trying out resampling in a wide variety of educational settings. But more empirical study would be valuable.
INTRODUCTION
The introductory statistics course is troublesome. Many readers will surely confirm that assertion with their own knowledge of what students and teachers say about the subject. And there is much written testimony to this effect by thoughtful critics of statistics education. Garfield (1991) summarizes: "A review of the professional literature over the past thirty years reveals a consistent dissatisfaction with the way introductory statistics courses are taught" (p. 1). Garfield asserts (referring to her dissertation, 1981, and to work by Wise) that "It is a well known fact that many students have negative attitudes and anxiety about taking statistics courses" (p. 1). "Students enrolled in an introductory statistics course have criticized the course as being boring and unexciting... Instructors have also expressed concern that after completing the course many students are not able to solve statistical problems..." (1981, quoting Duchastel, 1974).
Teachers of statistics have responded by trying a wide variety of devices to mitigate the problem, and many of them certainly can be valuable. But nothing has availed as a general solution.
RESAMPLING AND THE TEACHING OF STATISTICS
Resampling, especially the bootstrap method, has been hailed as a major breakthrough - the only one since 1973 listed in the Breakthroughs in Statistics volumes (Kotz and Johnson, 1992). But as yet the method has been little taught in conventional texts and classes. We shall also mention in the Comments section some other effects on the teacher and the teaching process.
THE STUDY METHODS
The sources of data reported here are several:
1. In 1973, Simon taught both the standard method and the resampling method to a class of second-year and third-year university students at the University of Illinois, drawn from a great many social-science disciplines. All problems that were treated by the resampling method in class were also demonstrated by analytic methods, whereas many problems were solved by analytic methods that were not treated in class by the resampling method. Therefore, analytic methods had a large advantage over the resampling method in student time and attention, both in reading and in class. On the final exam, there were four questions that the student could choose whether to answer by analytic methods or by resampling. (The studies of Simon, and of Atkinson and of Shevokas mentioned next, were reported earlier in their joint publication, 1976.)
2. Carolyn Shevokas studied junior college students who had little aptitude for mathematics. She taught the resampling approach to two groups of students (one with and one without computer), and taught the conventional approach to a "control" group. She then tested the groups on problems that could be done either analytically or by resampling.
3. David Atkinson taught the resampling approach and the conventional approach to matched classes in general mathematics at a small college.
The studies listed above by Simon, Atkinson, and Shevokas were conducted without an interactive computer program and the personal computer (though with some experimentation on the mainframe computer using a precursor of Resampling Stats). Resampling benefits greatly from the use of computers, and hence the results of these studies may be considered less favorable than would be obtained at present. On the other hand, there was the possibility of upward bias in results due to experimenters wanting to see resampling be successful.
4. In the 1990s, a set of evaluative questions was put to students who were taught the resampling method to a greater or lesser extent, in our classes at the University of Maryland, College Park, and in classes of Martin Kalmar at Frederick Community College (Maryland), Marvin Zelen at Harvard University, James Higgins at Kansas State University, John Emerson and Sara Cairns at Middlebury College, Robert Cornell at the Milton Academy, Aaron Ellison at Mt. Holyoke College (Massachusetts), Chris Ricketts (Ricketts and Berry, 1994) at the University of Plymouth (England), Simcha Pollack at St. John's University (New York), Paul Switzer at Stanford University, W. I. Seaver at the University of Tennessee (Knoxville), Alan Garfinkel at UCLA, H. Charles Romesburg at Utah State University, and Cliff Lunneborg at the University of Washington (Seattle). These surveys are not controlled experiments. But the data were gathered by instructors who originally had no stake in the method other than the desire to use their class time to best advantage, and hence there is little ground for worry about survey bias.
5. We conducted a small follow-up study of students who had completed resampling statistics as their introductory statistics course at the University of Maryland from ten to thirty months prior to the survey, and we also obtained a comparison sample of students who had taken the same course taught with the conventional method only.
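Several of the studies above tested students on problems "that could be done either analytically or by resampling". For readers who have not seen that contrast, the following is a minimal sketch in Python. It is an illustration only and is not taken from the studies: the coin-toss problem, the function names, the number of trials, and the use of Python rather than Resampling Stats are all assumptions.

```python
import random
from math import comb


def analytic_prob(n=5, k=3, p=0.5):
    """Exact binomial probability of at least k heads in n tosses.

    Hypothetical example problem, not one used in the studies.
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def resampling_prob(n=5, k=3, p=0.5, trials=20000, seed=0):
    """Estimate the same probability by simulating the tosses and counting.

    Each trial tosses n coins; we record whether at least k came up heads,
    and report the fraction of trials in which that happened.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < p for _ in range(n))
        if heads >= k:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("analytic  :", analytic_prob())
    print("resampling:", resampling_prob())
```

With enough trials the simulated estimate should land close to the exact binomial value. The simulated route needs only tossing and counting, whereas the analytic route needs the binomial formula.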
THE RESULTS
1. Can Beginning Students Produce Correct Answers?
The ultimate aim of statistics education should be what we may call "statistical utility": to enable students to deal sensibly with realistic statistics problems, with full understanding of what they are doing (which leads to correct procedures). So we want to know whether the combination of the resampling method and its instruction can produce such a result.
a) Simon's early-1970s classroom experiments showed that students successfully produce correct answers to problems in probability and statistics with this method. The students' choices of method give an indication of the usefulness of the resampling method. These were the results:
i) Almost every student used the resampling method for at least one question. In total, almost half of the answers given were done by the resampling method (41 of 84).
ii) There is a propensity slightly greater than chance for the students who did better on the examination as a whole to do a larger proportion of problems by the resampling method.
iii) Analytic and resampling methods were both used on each question by some students.
iv) The grades that the students received were somewhat higher on the questions answered with the resampling method than on those questions answered with analytic methods.
b) Shevokas' students taught with the resampling method were able to solve more than twice as many problems correctly as students who were taught the conventional approach.
c) Atkinson's students who learned the resampling method did better on the final exam with questions about general statistical understanding. They also did much better solving actual problems, producing 73 percent more correct answers than the conventionally-taught control group.
These experiments are strong evidence that students who learn the resampling method are able to solve problems better than are conventionally taught students.
2. Can the Method Be Learned Rapidly?
a) As early as junior high school, students taught by a variety of instructors, and in languages other than English, have in a matter of six short hours learned how to handle problems that students taught conventionally do not learn until advanced university courses. Gideon Keren successfully taught the resampling approach for just six hours to 14- and 15-year-old high school students in Jerusalem. And Simon taught the method to juniors and seniors in the select university high school with great success (see Simon with Holmes, 1969).
b) In Simon's first university class, only a small fraction of total class time -- perhaps an eighth -- was devoted to the resampling method, as compared to seven-eighths spent on the conventional method. Yet the tested students solved problems more correctly, and solved more problems, with the resampling method than with the conventional method. This suggests that resampling is learned much faster than the conventional method.
c) In the Shevokas and Atkinson experiments the same amount of time was devoted to both methods, but the resampling method achieved better results. In those experiments learning with the resampling method is at least as fast as with the conventional method, and probably considerably faster.
3. Is the Resampling Method Interesting and Enjoyable to Learn?
a) Shevokas asked her groups of students for their opinions and attitudes about the section of the course devoted to statistics and probability. The attitudes of the students who learned the resampling method were far more positive -- they found the work much more interesting and enjoyable -- than the attitudes of the students taught with the standard method. And the attitudes of the resampling students toward mathematics in general improved during the weeks of instruction, while the attitudes of the students taught conventionally changed for the worse.
Shevokas summed up the students' reactions as follows: "Students in the experimental (resampling) classes were much more enthusiastic during class hours than those in the control group, they responded more, made more suggestions, and seemed to be much more involved".
b) Gideon Keren told high school students in Jerusalem that they would not be tested on this material. Yet Keren reported informally that the students were very much interested. Between the second and third class, two students asked to join the class even though it was their free period! And as the instructor, Keren enjoyed teaching this material because the students were enjoying themselves.
c) Atkinson's resampling students had "more favorable opinions, and more favorable changes in opinions" about mathematics generally than the conventionally-taught students, according to an attitude questionnaire. And with respect to the study of statistics in particular, the resampling students had much more positive attitudes than did the conventionally-taught students.
d) Lines 1 and 2 in Table 1 show that students find the course more interesting and less frightening than they had expected. We do not have comparable data for students taught with conventional methods (because we have not been able to obtain access to these students). Yet these results (and others in the table) seem quite incompatible with the complaints quoted in the introduction. And the instructors at other institutions whose results are reported in Table 1 express a similar view. Ricketts and Berry (1994) reported "very positive responses from the students [taught resampling], especially the less mathematically able". They found that resampling was "highly acceptable to students with a range of mathematical abilities".
Table 1
4. Do Students Assess the Learning Experience Positively?
a) Lines 3-5 in Table 1 show favorable attitudes toward the courses taught using the resampling method.
b) Table 2 shows the responses of former students of introductory statistics - both those taught with the resampling approach and those taught with the conventional method - to whom we mailed a questionnaire 10 to 30 months after they completed the courses. We asked how much they thought they had learned in the courses, how much they retained, how valuable they consider the study of statistics, and about their use of statistics at work or in private life. A much larger proportion of students taught with the resampling method responded positively to all these questions than did the students taught with the conventional method. The very large differences are especially impressive because educational experiments commonly show small differences among treatments (which is why educational psychologists are such heavy users of statistical techniques that identify small differences). These differences - ratios of 1:2 versus 2:1 in positive:negative responses - need no statistical test to prove significance. There are many non-comparable aspects of these two treatments, including the fact that many of the conventionally-taught students were part of a very large class. But as a first approach to such a comparison, the results certainly are provocative, especially when taken together with the other results presented here.
Table 2
DISCUSSION
1. The reason that resampling works so well in the classroom is that it allows students to escape from the formalism of algebra while having full understanding of what they are actually doing. Resampling escapes from the trap described so vividly by Kempthorne:
...there has been a failure in the teaching of statistics that originates with a failure of the teaching of teachers of statistics,... Part of the malaise that I see occurs, I believe, because it is easy to think of counting and of areas and volumes, so rather than teach something about statistics, one takes the easy route of teaching a species of mathematics.
And one can get a partial justification because this species of mathematics is a critical part of the whole area. What must happen is that the ideas and aims of statistics must determine the mathematics of statistics that is taught and not vice versa. Mathematics is surely a beautiful art form (in addition to being useful). If the statistics that is taught is to have this good form, then its form is determined by its mathematical form. And then, I suggest, form wins out over content, and essential ideas of statistics are lost... (1980, p. 19)
2. We are, of course, aware of research flaws in our survey data (though not the experimental data), including: 1) the colleges and universities from which we have obtained data are not a random sample of all statistics students, 2) the instructors are not compared with others by a random-selection procedure, and 3) we have no control data for these surveys. There are many other problems as well. We are also aware that our follow-up survey has large self-selection problems. But we believe that until shown to be fallacious, this body of evidence is much better than none, especially since it is corroborated by the careful controlled experiments reported in Simon, Atkinson, and Shevokas. We also note here that no one has presented any evidence contradicting our data here, or - to our knowledge - any systematic data at all on any other teaching approaches. We hope that critics will take this into account before dismissing our data for its research flaws - and that they will keep in mind that all research has holes in it.
We also remind readers of our wager offer, originally made by Simon alone and now made by Bruce, too.
To confront the opposition to these new ideas with a challenge that dramatises the power of the resampling method, we offer to wager $5000 that after just six hours of instruction and practice in resampling, people will produce more correct answers to realistic problems than after 12 or 18 hours of conventional instruction.
3. Because of the nature of the simulation process, it has been possible to create an automatic tutor that checks whether the student gets the right result. If a wrong result is indicated, the tutor detects where in the program the student's logic has gone wrong. The computer tutor then informs the student of this error, and the student can correct the error. A preliminary version of this tutor is available from the authors.
4. Following on the automatic tutor, there also is a semi-automated examination grading plan that works as follows: In the exam room (without computers), the student writes a program, and leaves a carbon copy with the instructor. Later, the student runs the program on a computer. If the tutor program tells the student that the answer is wrong, the student then corrects the program, marks the errors on the exam copy that he/she has taken home (the carbon copy held by the teacher prevents cheating, that is, changes made by the student without being marked as changes), and hands in the corrected exam copy plus the computer printout. This self-checking method gives the students immediate feedback and of course greatly reduces the burden of grading exams.
6. Then-president of the American Statistical Association Arnold Zellner wrote:
I challenged participants to design and perform controlled experiments to show that their proposed solutions to the suffering problem [in statistical education] actually work. Perhaps you can do society a service by developing the methodology and showing that your resampling approach produces significantly less suffering and more statistical educational value than other approaches.
Such scientific, positive approaches would be extremely valuable, in my opinion. (Correspondence, November 12, 1990)
In the spirit of Zellner's remark, we hope that the results we provide here are at least sufficiently strong to be worth further testing and replication. We invite other teachers and researchers in resampling to join with us in cooperative testing of these methods, and in joint publication of the results. We will be happy to make available a wide range of materials to further such research. Ultimately, a shift this great in practice and education will require a very broad body of proof of success. No one piece of research can be conclusive, but all can contribute. We also invite suggestions for improvements in methods of inquiry into the issue at hand - better questions to ask, and additional ways to evaluate the data, among other things.
SUMMARY AND CONCLUSIONS
As far back as the 1970s, controlled experiments (Simon, Atkinson, and Shevokas, 1976) found better classroom results for resampling methods than for conventional methods, even without the use of computers. Students handle more problems correctly, and like statistics much better, with resampling than with conventional methods. The experiments comparing the resampling method against conventional methods also showed that students enjoy learning statistics and probability this way.
Recent surveys of student judgments of courses using the resampling method - including both introductory classes and graduate students in statistics, and taught at places ranging from Frederick Community College to Stanford University Graduate School - show that students approve of the method. They say they learn from it, would recommend a course taught with the resampling method, find it interesting, and use what they learned in the years after they finish the course. The current students, like the students in the 1970s experiments, do not show the panic about this subject often shown by other students.
This contrasts sharply with the less positive reactions of students learning by conventional methods, even when the same teachers teach both methods in the experiment.
These results should constitute a prima facie case for at least trying out resampling in a wide variety of educational settings. But more empirical study would be welcome. The statistical utility of resampling and other methods is an empirical issue, and the test population should be non-statisticians.
REFERENCES
Barlow, Roger, Statistics (New York: Wiley, 1989).
Dallal, Gerard E., "Statistical Computing Packages: Dare We Abandon Their Teaching to Others?", The American Statistician, November 1990, vol. 44, no. 4, pp. 265-266.
Dowdy, Shirley, and Stanley Wearden, Statistics for Research (Wiley & Sons, 1991).
Duchastel, P. C., "Computer Applications and Instructional Innovation: A Case Study in the Teaching of Statistics", International Journal of Mathematical Education in Science and Technology, vol. 5, 1974, pp. 607-616, cited in Garfield, 1981.
Freedman, David, Robert Pisani, Roger Purves, and Ani Adhikari, Instructor's Manual for Statistics (second edition) (New York: Norton, 1991).
Garfield, Joan, "An Investigation of Factors Influencing Student Attainment of Statistical Competence", PhD Thesis, U. of Minnesota, 1981.
Garfield, Joan, "Reforming the Introductory Statistics Course," paper presented at the American Educational Research Association Annual Meeting, Chicago, 1991.
Garfield, Joan, and Andrew Ahlgren, "Difficulties in Learning Basic Concepts in Probability and Statistics: Implications for Research", Journal for Research in Mathematics Education, vol. 19, no. 1, 1988, pp. 44-63.
Hey, John D., Data in Doubt: An Introduction to Bayesian Statistical Inference for Economists (Oxford: Martin Robertson, 1983).
Hogg, Robert V., "Statistical Education: Improvements are Badly Needed", The American Statistician, vol. 45, Nov. 1991, pp. 342-343.
Hotelling, H., "The Teaching of Statistics," Annals of Mathematical Statistics, 11, 1940, pp. 457-72.
Kempthorne, Oscar, "The Teaching of Statistics: Content Versus Form," The American Statistician, February 1980, vol. 34, no. 1, pp. 17-21.
Kotz, Samuel, and Norman L. Johnson (eds.), Breakthroughs in Statistics, Volumes I and II (New York: Springer-Verlag, 1992).
Moore, Thomas L., and Rosemary A. Roberts, "Statistics at Liberal Arts Colleges," The American Statistician, May 1989, vol. 43, no. 2, pp. 80-85.
Ricketts, Chris, and John Berry, "Teaching Statistics Through Resampling", Teaching Statistics, 16, #2, Summer, 1994, pp. 41-44.
Ruberg, Stephen J., Biopharmaceutical Report, Vol 1, Summer, 1992.
Scheaffer, Richard L., "Toward a More Quantitatively Literate Citizenry", The American Statistician, 44, February 1990, p. 2.
Simon, Julian L., "A Really New Way to Teach (and Do) Probability Statistics," The Mathematics Teacher, Vol. LXII, April, 1969, pp. 283-288.
Simon, Julian L., David T. Atkinson and Carolyn Shevokas, "Probability and Statistics: Experimental Results of a Radically Different Teaching Method," The American Mathematical Monthly, Vol. 83, November, 1976, pp. 733-739.
Singer, Judith D. and John B. Willett, "Improving the Teaching of Applied Statistics: Putting the Data Back Into Data Analysis," The American Statistician, August 1990, vol. 44, no. 3, pp. 223-230.
Vaisrub, Naomie, Chance, Winter, 1990, p. 53.