Article report time! Today I read "Five Ways to Fix Statistics" (Nature, Nov 2017), which interviews six statistics luminaries about what's "wrong" with modern statistics. Here are some choice quotes (all emphasis mine):
"In the past couple of decades, many fields have shifted from data sets with a dozen measurements to data sets with millions. Methods that were developed for a world with sparse and hard-to-collect information have been jury-rigged to handle bigger, more-diverse and more-complex data sets. No wonder the literature is now full of papers that use outdated statistics, misapply statistical tests and misinterpret results. The application of P values to determine whether an analysis is interesting is just one of the most visible of many shortcomings."
-- Jeff Leek
"In many fields, decisions about whether to publish an empirical finding, pursue a line of research or enact a policy are considered only when results are 'statistically significant', defined as having a P value (or similar metric) that falls below some pre-specified threshold. This approach is called null hypothesis significance testing (NHST). It encourages researchers to investigate so many paths in their analyses that whatever appears in papers is an unrepresentative selection of the data."
"Statistical-significance thresholds are perhaps useful under certain conditions: when effects are large and vary little under the conditions being studied, and when variables can be measured accurately. This may well describe the experiments for which NHST and canonical statistical methods were developed, such as agricultural trials in the 1920s and 1930s examining how various fertilizers affected crop yields. Nowadays, however, in areas ranging from policy analysis to biomedicine, changes tend to be small, situation-dependent and difficult to measure. For example, in nutrition studies, it can be a challenge to get accurate reporting of dietary choices and health outcomes."
"The plethora of options creates a hazard that statistician Andrew Gelman has dubbed the garden of forking paths, a place where people are easily led astray. In the vast number of routes, at least one will lead to a 'significant' finding simply by chance. Researchers who hunt hard enough will turn up a result that fits statistical criteria — but their discovery will probably be a false positive."
"Norms are established within communities partly through methodological mimicry. In a paper published last month on predicting suicidality, the authors justified their sample size of 17 participants per group by stating that a previous study of people on the autism spectrum had used those numbers. Previous publication is not a true justification for the sample size, but it does legitimize it as a model. To quote from a Berwick report on system change, "culture will trump rules, standards and control strategies every single time" (see go.nature.com/2hxo4q2)."
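That forking-paths hazard is easy to see in simulation. Here's a minimal sketch (my own, using numpy and scipy; the group sizes and number of "paths" are illustrative): run a classic t-test on pure noise many times, once per analysis route a researcher might plausibly try, and some comparisons come up "significant" even though no real effect exists anywhere.

```python
# Sketch of the "garden of forking paths": test enough hypotheses on
# pure noise and some will look "significant" purely by chance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_paths = 100      # analysis choices a researcher might try
n_per_group = 30   # samples per group in each comparison

p_values = []
for _ in range(n_paths):
    # Both groups are drawn from the SAME distribution: there is no effect.
    a = rng.normal(size=n_per_group)
    b = rng.normal(size=n_per_group)
    p_values.append(stats.ttest_ind(a, b).pvalue)

false_positives = sum(p < 0.05 for p in p_values)
print(f"{false_positives} of {n_paths} comparisons hit p < 0.05 on pure noise")
```

With a 5% threshold you expect roughly five "discoveries" per hundred routes even when nothing is there, which is exactly why a researcher who hunts hard enough will find something publishable.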
The tldr is that:
- We're too confident. We hide uncertainty. We overstate claims.
- We do that because of (a) incentive structures around academic publishing, (b) cognitive errors and bias around wanting certainty.
Some of the proposed solutions are:
- Trial registries like ClinicalTrials.gov and the AEA RCT registry (I used to work on this! oh memories).
- Changing norms around p-values for grant funding, article publication, etc.
- Accepting, with cold existential dread, that we live in a probabilistic universe.
This article also made me think about the data science hype/bubble. Data-driven decision-making is definitely preferable to, uh, gut-driven decision-making? norms-driven decision-making? There's still a lot more data out there to be scienced. So the bubble is, maybe, merited. I mean, I work in the bubble, so please please let's keep this data thing going, okay!? But I don't worry too much. We live in a world where enormous amounts of data are being passively generated, and this is only going to increase (exponentially?) as IoT stuff takes off. The day your WiFi-enabled fridge starts judging you will be a day of much glory for statisticians and machine learningers.
But data literacy is so important, and so under-taught. And if my human brain understands Daniel Kahneman and Amos Tversky's work on human brains, we are also just not great at understanding uncertainty. We want certainty - and positive results! As some colleagues used to say, half-joking, "up and to the right!" But null results are important; counter-intuitive (in a not-sexy, non-"ah ha!" way) results are important.
I feel like, right now, there's a lot of gatekeeping around data science: it gets treated as something very complex and difficult, a black box full of dark magic. It's not: it's mostly linear algebra (which you can learn here). Worryingly, people - even fancy people - still conflate prediction with causation. Prediction isn't causation; it's mostly correlation. Hence machine learning's infamous reputation for perpetuating bias: if you don't care to understand why something worked, you're less likely to confront ugly correlations and recognize them for what they are.
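A toy example (mine, not the article's) of why good prediction isn't causation: below, x predicts y almost perfectly, yet x has no causal effect on y at all, because both are driven by a hidden confounder z. The variable names and noise levels are made up for illustration.

```python
# Prediction vs causation: x is an excellent *predictor* of y,
# but intervening on x can never move y, because y's data-generating
# process only involves the hidden confounder z.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

z = rng.normal(size=n)             # hidden confounder (the real cause)
x = z + 0.1 * rng.normal(size=n)   # observed feature, driven by z
y = z + 0.1 * rng.normal(size=n)   # outcome, also driven by z

corr = float(np.corrcoef(x, y)[0, 1])
print(f"corr(x, y) = {corr:.2f}")  # very high: great prediction, zero causation
```

A model trained on (x, y) pairs can't tell this apart from a causal relationship, which is exactly the disentangling work that gets skipped when all you grade is predictive accuracy.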
Anyway. Some random thoughts. I definitely recommend Joel Best's Damned Lies and Statistics. It's not very technical, and is instead more interested in the history and culture of stats - something that might be just as important as the technical side.