Few have done more for the study of numbers than Ronald Fisher. He is a sacred figure in the world of statistics and is often described as the most important figure in the development of modern statistical research.
A dispute over tea
In the early 1920s, Fisher was employed at an agricultural research station north of London, where he was tasked with developing ways to improve its experiments. He was taking the customary 4 pm tea break with a coworker, Muriel Bristol, and was taken aback when she refused the cup he offered her because he had poured the milk in before the tea. It tasted much better if the milk went in second, she insisted.
Fisher found her claim preposterous. Another colleague, however, suggested they subject Bristol to a blind taste test to see if she really could tell the difference. Fisher thought that just making one cup in each style would leave too much room for error; Bristol might just get lucky and guess right. So he made eight cups: four milk first, four tea first. To his astonishment, Bristol correctly identified each one.
“Sometimes the only thing you can do with a poorly designed experiment is to try to find out what it died of.”
Fisher did not appear to dwell much on why Bristol was able to taste the difference (there is now a scientific explanation… yet). Instead, the episode made him think about how the experiment could have been designed even better, with an even lower risk of error.
In theory, Bristol could have simply been extremely lucky and identified all eight cups by guessing. Since four of the eight cups were milk-first, there are 70 different ways to pick out four cups, and only one of them is entirely right. The chance of a perfect score by luck, he calculated, was therefore 1 in 70. And what if she had guessed correctly six out of eight times? That would suggest that she probably could tell the difference, but there was roughly a 1 in 4 chance (17 in 70) that she had simply gotten lucky and done at least that well.
However, if the sample size were increased to 12, Fisher determined, the chance of error would be significantly reduced.
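The odds quoted above can be checked with a few lines of Python. This is only a minimal sketch, not Fisher's own working. It assumes the cups are split evenly between the two styles and that the taster knows this, so that guessing amounts to picking half the cups at random. It also assumes that a sample size of 12 means 12 cups, six of each style. The function name `chance_of_lucky_guess` is made up for this illustration.

```python
from fractions import Fraction
from math import comb


def chance_of_lucky_guess(cups: int, min_correct: int) -> Fraction:
    """Chance that pure guessing gets at least `min_correct` cups right.

    Assumption (not from the article): half the cups are milk-first, the
    taster knows this, and so she names exactly half of them as milk-first.
    """
    if cups % 2 != 0:
        raise ValueError("this sketch assumes an even number of cups")
    half = cups // 2
    total_ways = comb(cups, half)  # every way of choosing the "milk-first" cups
    lucky_ways = 0
    for hits in range(half + 1):
        # `hits` = milk-first cups she picked correctly. Each milk-first cup
        # she misses also mislabels one tea-first cup, so the total number
        # of correctly labelled cups is 2 * hits.
        if 2 * hits >= min_correct:
            lucky_ways += comb(half, hits) * comb(half, half - hits)
    return Fraction(lucky_ways, total_ways)


print(chance_of_lucky_guess(8, 8))    # 1/70
print(chance_of_lucky_guess(8, 6))    # 17/70, roughly 1 in 4
print(chance_of_lucky_guess(12, 12))  # 1/924
```

Under these assumptions, a perfect score on 12 cups would happen by luck only once in 924 tries, compared with once in 70 for eight cups.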
Putting the ideas on paper
It may seem obvious that a larger sample size reduces the chance of error, but at the time, there was not a set of best practices for statistical research. Fisher changed that with his seminal book, Statistical Methods for Research Workers, published in 1925. Among other things, Fisher offered a definition of statistical significance that has since shaped how researchers interpret the findings of a study.
Ten years later he published another book, The Design of Experiments, which introduced a number of important concepts, including the “null hypothesis” and, in a nod to Bristol, the “lady tasting tea” experiment.
Key Dates
- 1919: Doing Critical Research on Crop Cultivation and Genetics
  Fisher begins a 14-year stint at the Rothamsted Experimental Station, north of London. His research on crop cultivation and genetics is credited with bringing important innovations to agriculture that boosted crop yields and reduced world hunger.
- 1925: Publishing a Textbook on Best Practices to Reduce Error in Experiments
  He publishes Statistical Methods for Research Workers, a seminal textbook on statistical research that proposed best practices on how to reduce error in experiments.
- 1935: Focusing on Other Statistical Concepts like the Null Hypothesis
  Fisher publishes The Design of Experiments, which introduces other foundational statistical concepts, notably the null hypothesis.