Making Your Research Reproducible with R

Post provided by Laura Graham

Reproducible research is important for three main reasons. Firstly, it makes it much easier to revisit a project a few months down the line, for example when making revisions to a paper which has been through peer review.

Secondly, it allows the reader of a published article to scrutinise your results more easily – meaning it is easier to show their validity. For this reason, some journals and reviewers are starting to ask authors to provide their code.

Thirdly, having clean and reproducible code available can encourage greater uptake of new methods. It’s much easier for users to replicate, apply and improve on methods if the code is reproducible and widely available.

Throughout my PhD and postdoctoral research, I have aimed to use a reproducible workflow, and this generally saves me time and helps to avoid errors. Along the way I’ve learned a lot through the advice of others, and through trial and error. In this post I have set out a guide to creating a reproducible workflow and provided some useful tips.

Achieving Reproducibility in Research

Earlier this month Leila Walker attended a panel discussion imparting ‘Practical Tips for Reproducible Research’, as part of the Annual Meeting of the Macroecology Special Interest Group (for an overview of the meeting as a whole check out this Storify). The session and subsequent drinks reception were sponsored by Methods in Ecology and Evolution. Here, Leila reports back on the advice offered by the panel members.

For anyone interested in viewing further resources from the session, please see here. Also, you may like to consider attending the best practice for code archiving workshop at the 2016 BES Annual Meeting. Do you have any tips for making your research reproducible? Comment on this post or email us and let us know!

This year’s Annual Meeting of the Macroecology SIG was the biggest yet, with around 75 attendees and even representation across the PhD, post-doc and faculty spectrum. The panel discussion aimed to consider what reproducibility means to different people, identify the reproducibility issues people struggle with, and ultimately provide practical tips and tools for how to achieve reproducible research. Each of the participants delivered a short piece offering their perspective on reproducibility, with plenty of opportunity for discussion during the session itself and in the poster and wine reception that followed.

Attendees enjoy a wine reception (sponsored by MEE) whilst viewing posters and reflecting on the Reproducible Research panel discussion. Photo credit: Leila Walker

There’s Madness in our Methods: Improving inference in ecology and evolution

Post provided by Jarrod Hadfield

Last week the Center for Open Science held a meeting with the aim of improving inference in ecology and evolution. The organisers (Tim Parker, Jessica Gurevitch & Shinichi Nakagawa) brought together the Editors-in-chief of many journals to try to build a consensus on how improvements could be made. I was brought in due to my interest in statistics and type I errors – be warned, my summary of the meeting is unlikely to be 100% objective.

True Positives and False Positives

The majority of findings in psychology and cancer biology cannot be replicated in repeat experiments. As evolutionary ecologists we might be tempted to dismiss this because psychology is often seen as a “soft science” that lacks rigour and cancer biologists are competitive and unscrupulous. Luckily, we as evolutionary biologists and ecologists have that perfect blend of intellect and integrity. This argument is wrong for an obvious reason and a not so obvious reason.

We tend to concentrate on significant findings, and with good reason: a true positive is usually more informative than a true negative. However, of all the published positives, what fraction are true positives rather than false positives? The knee-jerk response to this question is 95%. However, the probability of a false positive (the significance threshold, alpha) is usually set to 0.05, and the probability of a true positive (the power, 1 - beta) in ecological studies is generally less than 0.5 for moderate sized effects. Taking the power to be 0.5, the probability that a published positive is true is therefore 0.5/(0.5+0.05) = 91%. Not so bad. But this assumes that the hypothesis and the null hypothesis are equally likely. If that were true, rejecting the null would give us very little information about the world (a single bit actually) and would be unlikely to be published in a widely read journal. A hypothesis that had a plausibility of 1 in 25 prior to testing would, if true, be more informative, but then the fraction of published positives that are true would be down to (1/25)*0.5/((1/25)*0.5+(24/25)*0.05) = 29%. So we can see that high false positive rates aren’t always the result of sloppiness or misplaced ambition, but an inevitable consequence of doing interesting science with a rather lenient significance threshold.
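The arithmetic in the paragraph above can be checked with a few lines of code. This is only a minimal sketch in Python; the function name and argument names are illustrative choices made here, not something taken from the post. It applies the same formula, weighting the power by the prior plausibility of the hypothesis and the significance threshold by the prior plausibility of the null.

```python
def fraction_true_positives(prior, power, alpha):
    """Fraction of significant results that are true positives.

    prior: plausibility of the hypothesis before testing
    power: probability of a significant result when the hypothesis is true
    alpha: probability of a significant result when the null is true
    (All three names are illustrative, not from the original post.)
    """
    true_pos = prior * power          # hypothesis true and detected
    false_pos = (1 - prior) * alpha   # null true but test significant
    return true_pos / (true_pos + false_pos)


# Hypothesis and null equally likely: 0.5/(0.5+0.05) after cancelling the prior
equal_odds = fraction_true_positives(prior=0.5, power=0.5, alpha=0.05)

# Hypothesis with a plausibility of 1 in 25 prior to testing
long_shot = fraction_true_positives(prior=1 / 25, power=0.5, alpha=0.05)

print(f"prior 1/2:  {equal_odds:.0%}")   # 91%
print(f"prior 1/25: {long_shot:.0%}")    # 29%
```

Lowering the power below 0.5, which the text suggests is typical for ecological studies, pushes both figures down further.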

Towards a More Reproducible Ecology

The following post has been provided by Dr Nick Isaac.

Nick is organising the OpenData and Reproducibility Workshop at Charles Darwin House, London on 21 April 2015 (more information below). He is also an Associate Editor for Methods in Ecology and Evolution.

The open science movement has been a major force for change in how research is conducted and communicated. Reproducibility lies at the heart of the open science agenda. It’s a broad topic, covering how data are shared, interpreted and reported.

Reproducibility has been advanced by a coalition of publishers (who have been embarrassed by a series of high-profile retractions), funding agencies keen that data should be re-useable after the life of a grant, and young researchers taking a more collaborative attitude than previous generations.

There is now a vast range of tools and platforms to help scientists share data and other materials (e.g. Dryad, GitHub, Figshare) and to create efficient and reproducible workflows (e.g. Sweave, Markdown, Git and, of course, R). There’s even a MOOC (Massive Open Online Course) in Reproducible Research, run out of Johns Hopkins University.

Ecology has lagged behind wet-lab biology and other disciplines in the adoption of reproducibility concepts and there are few examples of ecological studies that are truly reproducible. To address this, we’re running a one-day workshop at Charles Darwin House, London on Tuesday 21 April entitled OpenData & Reproducibility Workshop: the Good Scientist in the Open Science era.