Preserving our values

A quick overview of the P value. Figure from Repapetilto on Wikimedia Commons.

A psychology journal recently elected to ban the p value from its publications.

From the linked Nature piece:

Jan de Ruiter, a cognitive scientist at Bielefeld University in Germany, tweeted: “NHST [null hypothesis significance testing] is really problematic”, but added that banning all inferential statistics is “throwing away the baby with the p-value”.

If the objective is to prevent p values from being used as stand-ins for real, repeatable results, why not require further significance testing in addition to rejecting the null hypothesis? Psychology may have some specific issues* (e.g., small sample sizes, variation across samples, or even good ol' false positives), but much like other areas of study, it requires methods to compare experimental results.
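To make the false-positive point concrete, here is a minimal Python sketch. It is my own illustration, not anything from the Nature piece or the journal. It assumes two small groups drawn from the same normal distribution with a known standard deviation of 1, so the null hypothesis is true by construction, and it uses a simple two-sided z-test. The names `two_sided_p` and `false_positive_rate` are made up for this example.

```python
import random
from statistics import NormalDist, mean


def two_sided_p(a, b, sigma=1.0):
    """Two-sided z-test p value for a difference in means (sigma assumed known)."""
    se = sigma * (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_per_group=10, trials=2000, alpha=0.05, seed=0):
    """Fraction of null experiments that still come out 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both groups come from the same distribution: there is no real effect.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(false_positive_rate())
```

Under these assumptions, roughly one in twenty of these no-effect experiments should still clear the 0.05 threshold, which is why a single significant result is a weak stand-in for a repeatable one.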

There's a temptation to suggest that reviewers should be responsible for determining whether a p value is used appropriately. A seemingly significant p value has great power, though, and the average reviewer may not read beyond p < 0.05 even if other analyses are presented, or even if those analyses failed. As with nearly every other aspect of publishing, results and their interpretation should be handled on a case-by-case basis without eliminating any options. Abuse of statistics will only get worse without consistent ways to interpret data.

* I'm not a psychologist, have never published in a psychology journal, and rarely speak with anyone practicing psychology research.