Research methods in the twenty-first century
We neuroscientists live in interesting times.
There has been much recent discussion of the possibility that science as currently practiced and published has a high rate of error. A 2015 study (Open Science Collaboration, 2015) found that only 39% of a quasi-random sample of cognitive and social psychology articles could be replicated. Articles that were more complicated to replicate or that had more surprising results had lower replication rates. Ioannidis (2005) has argued that the false positive rate will be higher in expensive, fashionable fields with small sample sizes and complex analyses – features typical of neuroscience techniques such as neuroimaging. For example, the replication rate of associations between candidate genes and diseases or disease risk factors was around 1.2% (Ioannidis et al., 2011).
How did this high rate of false positives come about? How can we make published science more likely to be true?
Poor incentives surely play a part: career rewards favor publishability over truth (Nosek et al., 2012). But I believe that another major factor is the increasing gap between the training we give our students and the needs of disciplined scientific practice.
What is disciplined scientific practice? It takes two forms. The first could be called “best practices in scientific computing”. There is general agreement about what this means, and it includes standard use of version control, automated testing, and peer review of scientific code (Wilson et al., 2014). These are all techniques that can be shown to reduce error and improve collaboration. Like many other skills, they are best taught by practice with expert feedback, but they are rarely taught that way.
The second aspect of disciplined scientific practice is related: we need a habit of mind that applies steadfast skepticism to our own work and to that of other scientists.
There are many factors that make skepticism more difficult in modern science. One is that there is a large gap between the methods that we teach and the methods used in many recent papers. Standard neuroscience and psychology courses teach t-tests, correlation and ANOVA using the methods pioneered by Fisher in the 1920s (Fisher, 1925). This teaching works well if the student will only ever use t-tests, ANOVA or correlation, but it gives little help in understanding such now-standard methods as Principal Components Analysis, constrained regression and other techniques grouped under the heading of “machine learning” (Hastie et al., 2009). To give some idea of the gap that has opened up between the language of statisticians and psychologists, a recent textbook called “All of statistics: a concise course in statistical inference” (Wasserman, 2013) does not have the terms “ANOVA” or “Analysis of variance” in its index.
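The gap is smaller than it may appear: a method like Principal Components Analysis is only a few lines of linear algebra once it is written out. The sketch below (my own illustration, not from any particular course) computes PCA directly from the singular value decomposition of a small simulated dataset:

```python
import numpy as np

# Simulated data: 100 observations of 3 variables, with the third
# column made to correlate strongly with the first.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)

# PCA is just linear algebra: center the columns, then take the
# singular value decomposition of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# The rows of Vt are the principal components: the orthogonal
# directions of greatest variance in the data.
components = Vt
# Variance explained by each component.
explained = s ** 2 / (len(X) - 1)
print(explained / explained.sum())  # proportion of variance per component
```

A student who has written this once can check any package's PCA output against it, and knows exactly what "variance explained" means, because they computed it.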
A second barrier to skepticism is that our methods courses have come to rely heavily on off-the-shelf statistical packages such as SPSS and the R computing language. Outside a narrow range of analyses, the student is unlikely to know what calculations these programs are performing, making it easy to slip into uncritical belief in their output – a phenomenon known as garbage-in, gospel-out. By teaching in this way we make it more likely that students will acquire the dangerous habit of feeding complex data through canned analysis pipelines and interpreting output that they do not fully understand.
The way to overcome both these barriers to skepticism is to teach our students in a different way. It is important for students to understand what calculations these packages are doing, and why. To quote Richard Feynman: “What I cannot create, I do not understand”. Luckily we now have many good tools to teach this kind of computational thinking. One major factor in making computing easier to teach has been the increasing use of the Python programming language for scientific computing. Python is so well adapted to teaching that it has become a common first language for school students and computer science undergraduates. Using Python and the many tools that have grown up around it, it is now possible to combine teaching of best computational practice, basic mathematical ideas and interactive computing into a single coherent course. With this background, students learn to criticize, collaborate and improve, while making it easier for their peers to do the same.
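As an example of the kind of exercise such a course can use (a sketch of my own, not drawn from any particular curriculum), a student can build the two-sample t statistic directly from its formula and then confirm that a standard package routine gives the same answer:

```python
import numpy as np
from scipy import stats

# Two simulated groups with a modest difference in means.
rng = np.random.default_rng(42)
a = rng.normal(0.0, 1.0, size=20)
b = rng.normal(0.5, 1.0, size=20)

# The independent two-sample t statistic, written out from the
# pooled-variance formula rather than hidden inside a routine.
pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (
    len(a) + len(b) - 2)
t_by_hand = (a.mean() - b.mean()) / np.sqrt(
    pooled_var * (1 / len(a) + 1 / len(b)))

# The same calculation via the packaged routine.
t_packaged, p_value = stats.ttest_ind(a, b)
print(t_by_hand, t_packaged)  # the two values agree
```

Having built the statistic once, the student knows what the package is doing, and is in a position to notice when a canned analysis is being asked a question it cannot answer.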
I believe that courses that teach like this are the only practical way to prepare our students to do sound, transparent and reproducible science in the modern world of data and analysis.
References

Open Science Collaboration and others. Estimating the reproducibility of psychological science. Science, 349(6251):aac4716, 2015.

Edward L Deci, Richard Koestner, and Richard M Ryan. A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125(6):627, 1999.

Ronald Aylmer Fisher. Statistical Methods for Research Workers. Genesis Publishing Pvt Ltd, 1925.

Trevor J. Hastie, Robert John Tibshirani, and Jerome H Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009.

John PA Ioannidis. Why most published research findings are false. PLoS Medicine, 2(8):e124, 2005. doi:10.1371/journal.pmed.0020124.

John PA Ioannidis, Robert Tarone, and Joseph K McLaughlin. The false-positive to false-negative ratio in epidemiologic studies. Epidemiology, 22(4):450–456, 2011.

Alfie Kohn. Punished by Rewards: The Trouble with Gold Stars, Incentive Plans, A's, Praise, and Other Bribes. Boston, 1993.

Brian A Nosek, Jeffrey R Spies, and Matt Motyl. Scientific utopia II: restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6):615–631, 2012.

Larry Wasserman. All of Statistics: A Concise Course in Statistical Inference. Springer Science & Business Media, 2013.

Greg Wilson, DA Aruliah, C Titus Brown, Neil P Chue Hong, Matt Davis, Richard T Guy, Steven HD Haddock, Katy Huff, Ian M Mitchell, Mark D Plumbley, and others. Best practices for scientific computing. PLoS Biology, 12(1):e1001745, 2014.