There is an abundance of literature on how to interpret experimental results, yet many homebrewers seem to ignore it and keep misinterpreting them.
EXAMPLES OF MISINTERPRETATION
These examples of misinterpretation motivated me to write this topic:
- Stan Hieronymus's recent article on First Wort Hopping
- A friend commenting on a post that there is no difference between step mashing and infusion mashing (and linking to the Brulosophy experiment on the subject)
I will do my best to avoid scientific jargon in my discussion below...
POSITIVE RESULTS
A positive result is when the investigator reports a statistically significant difference (commonly at the 95% confidence level or higher) between the treatments.
Statistical outcome: under the experimental conditions proposed (a very important caveat), it is highly likely that the treatments are different.
Practically, it means that if you reproduce the design and use the same beer style, your beer will likely differ between one treatment and the other.
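To make "statistically significant" concrete, here is a minimal Python sketch. It assumes the beers were compared with a triangle test (each taster gets three samples, two identical and one different, and must pick the odd one out). That is my assumption for illustration, not something every experiment uses, and the panel numbers and function names are made up.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n tries, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def triangle_p_value(tasters: int, correct: int) -> float:
    """Chance of seeing at least `correct` right answers if every taster
    is purely guessing (1 chance in 3 of picking the odd beer out)."""
    return binomial_tail(tasters, correct, 1 / 3)


# Hypothetical panel: 24 tasters, 13 of them pick the odd beer out.
p = triangle_p_value(24, 13)
print(f"p-value = {p:.3f}; significant at the 95% level: {p < 0.05}")
```

A p-value below 0.05 is what gets reported as a positive result at the 95% level. Note that it only speaks about this panel, these beers and this design.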
NEGATIVE / NULL RESULTS
A negative or null result is when the investigator reports NO statistically significant difference (commonly at the 95% confidence level or higher) between the treatments.
Statistical outcome: None. No statistical conclusions can be drawn.
Practically, it means that you should ignore the results until further information is collected.
REASONS FOR NEGATIVE / NULL RESULTS
- The treatment studied has no or minimal effect on beer (this is what most people think it means)
- The sample size chosen to test the beers was too small; a larger panel would have detected a difference (insufficient power, in stats jargon)
- The beer style tested was not a suitable style for the experiment
- The experimental design had one or more imperfections. This includes not only the brewing ingredients and process itself but also the testing conditions.
- The experimental design was not correctly executed (do not take offense, all investigators must consider this possibility)
- The beer quality was not very good, confounding the experimental variable (do not take offense, all investigators must consider this possibility; this is related to both the experimental design and its execution)
- Random error (aka chance)
- A combination of the above
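The sample size reason is the easiest one to put numbers on. The Python sketch below again assumes a triangle test and a simple model in which a fixed fraction of tasters can genuinely tell the beers apart while everyone else guesses. Both the model and the function names are my assumptions for illustration.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Probability of k or more successes in n tries, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(tasters: int, alpha: float = 0.05) -> int:
    """Smallest number of correct answers that counts as significant.
    Returns tasters + 1 when no outcome can reach significance."""
    for k in range(tasters + 1):
        if binomial_tail(tasters, k, 1 / 3) <= alpha:
            return k
    return tasters + 1


def power(tasters: int, discriminators: float, alpha: float = 0.05) -> float:
    """Chance of a significant result when a fraction `discriminators`
    of tasters truly perceive the difference and the rest guess."""
    p_correct = discriminators + (1 - discriminators) / 3
    return binomial_tail(tasters, critical_count(tasters, alpha), p_correct)


# Assumed scenario: 1 taster in 4 can genuinely tell the beers apart.
for n in (12, 24, 48, 96):
    print(f"{n:3d} tasters -> power {power(n, 0.25):.2f}")
```

Under this model, a panel of 24 tasters detects a real difference that a quarter of them can perceive only about 42% of the time. More often than not, such an experiment would report a null result even though the treatment does have an effect.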
WRONG INTERPRETATION OF NEGATIVE / NULL RESULTS
Many brewers keep interpreting null results as “The treatment studied has no or minimal effect on beer”. This is not correct. Any of the reasons listed above, alone or combined and with varying “weights”, could lead to a negative / null result.
SHOULD NEGATIVE / NULL RESULTS BE PUBLISHED?
The answer is almost always YES. The only caveat is that the experimental design must be correctly executed. If the investigator aimed for two 1.050 OG worts and one of them unexpectedly ended at 1.040, the investigator must repeat the experiment. If it happens two or more times, the investigator may have come across an unexpected finding that warrants further study.
CAN SIMILARITY / EQUIVALENCY BE STATISTICALLY TESTED?
The answer is theoretically yes, but the experimental design would be much more complex, time-consuming and costly. Even if similarity were proven statistically, the results would apply only to the experimental design tested, so there is no valid rationale to design equivalency experiments. Experimental designs whose statistical goal is to reject the null hypothesis are much simpler.
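To give a feel for the extra cost, here is a rough Python sketch of a similarity version of a triangle test. In it, "similar" is defined as "fewer than a chosen fraction of tasters can tell the beers apart", and the panel must score low enough to rule out anything above that fraction. The triangle test, the 20% threshold and the function names are all my assumptions for illustration.

```python
from math import comb


def binomial_cdf(n: int, k: int, p: float) -> float:
    """Probability of k or fewer successes in n tries, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def max_correct_for_similarity(tasters, max_discriminators=0.2, alpha=0.05):
    """Largest number of correct answers that still supports 'fewer than
    max_discriminators of tasters can tell'. None if no outcome qualifies."""
    p_limit = max_discriminators + (1 - max_discriminators) / 3
    best = None
    for k in range(tasters + 1):
        if binomial_cdf(tasters, k, p_limit) > alpha:
            break
        best = k
    return best


def chance_of_showing_similarity(tasters, max_discriminators=0.2, alpha=0.05):
    """Chance of passing the similarity test when the beers truly are
    indistinguishable, i.e. every taster is guessing."""
    limit = max_correct_for_similarity(tasters, max_discriminators, alpha)
    if limit is None:
        return 0.0
    return binomial_cdf(tasters, limit, 1 / 3)


for n in (12, 24, 48, 96, 200):
    print(f"{n:3d} tasters -> chance of showing similarity "
          f"{chance_of_showing_similarity(n):.2f}")
```

With small panels, even truly identical beers will usually fail to "prove" similarity under these assumptions. It takes a much larger panel before the test becomes reliable, which is the practical cost referred to above.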
SUGGESTIONS FOR IMPROVEMENT
On experimental design, the best suggestion for beer investigators is to perform a thorough literature search. Designing experiments without knowing previous experimental designs, their successes and their flaws will not improve our understanding of a question; it will just create confusion.
On interpretation of experiments, please refer to the REASONS FOR NEGATIVE / NULL RESULTS. Do not overinterpret results.