I was a fourth-year graduate student when I first had the idea to make an R package. Quite a few people thought it was a bit silly, or a bit of a waste of time, but I thought it was the right thing to do at the time, and in hindsight I think it has proven to be the right decision.
A very important ecological feature of a species is its geographic range, which can be described by its size, position and shape. Studying the geographic range can help us understand the ecological needs of a species and, in turn, plan conservation strategies. In ecological studies, mathematical models are the new standard for reconstructing the distribution of living species on Earth because of their accuracy in predicting a species’ presence or absence at unsampled locations. These methods reconstruct the climatic niche of a species and project it onto a geographic domain in order to predict the species’ spatial distribution. To do this, besides the occurrences of a species, the models require spatial maps of environmental variables, like temperature and precipitation, covering the entire study area.
In our recent paper in Methods in Ecology and Evolution, Alessandro Lúcio and I describe a new R package, metan, for multi-environment trial analysis. Multi-environment trials are a kind of trial in plant breeding programs where several genotypes are evaluated in a set of environments. Analyzing such data requires a combination of several approaches, including data manipulation, visualization and modelling. The latest stable version of metan (v1.5.1) is now on CRAN. So, I want to share the story of my first foray into using R, creating an R package, and submitting a paper to a journal I had never submitted to before.
In our recent article in Methods in Ecology and Evolution, Alessandro D. Lúcio and I describe a new R package for multi-environment trial analysis called metan. Multi-environment trials are a type of trial in plant breeding programs in which several genotypes are evaluated in a set of environments. Analyzing these data requires a combination of several approaches, including data manipulation, visualization and modelling. The latest stable version of metan (v1.5.1) is now available on CRAN. So I thought I would share the story of my first foray into using R by creating a package and submitting an article to a journal I had never submitted to before.
You can find out more about our Featured Articles (selected by the Senior Editor) below. We also discuss this month’s Open Access and freely available papers we’ve published in our latest issue (Practical Tools and Applications articles are always free to access, whether you have a subscription or not).
A warning: Halloween is nigh, and the following post contains graphic real-life imagery of maggot-eaten eye-sockets and deadly pianos. Read on… if you dare!
A Death in the Woods
In the vast and often frozen boreal forest of northern Canada there is a slow-burning forensic investigation into a death. The victim: a woodland caribou, an iconic species that is threatened or endangered throughout its range.
The scene is very much made for TV neo-Scandinavian neo-noir. From a not-too-luxurious regional office in the town of Fort Smith, just north of the Alberta border, over a steaming cup of coffee, world-weary biologist Allicia Kelly – who’s seen it all and then some – is monitoring the movements of collared animals on her computer screen. It’s the middle of May. The females, nearly all pregnant, are scattering to higher ground to find suitably cozy and secluded sites to calve. All is as peaceful and idyllic as a bunch of blips on a computer screen can be.
But then (cue slightly unsettling dissonance in the soundtrack) one of the little blips seems to have stopped moving. Kelly raises her eyebrow, tells herself to keep an eye out. A moment later she makes the call: “Team, we’ve got another ringer … let’s roll!”
As environmental managers, we’re frequently asked to make judgements about the relative health of the environment. This is often difficult because, by its nature, the environment is highly variable in space and time. Ideally, such judgements should be informed by robust scientific investigation, or more precisely, the reliable interpretation of the resulting data.
Type I and Type II Errors
Even with robust investigations and good data, our interpretations can sometimes be wrong. In general, this happens when:
the investigation concludes that an impact has occurred, when in fact it hasn’t (Type I error)
the investigation fails to detect an impact, when an impact has actually occurred (Type II error).
Understanding the circumstances that lead to these errors is unfortunately complicated, and difficult without a strong statistical background.
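To make the two error types concrete, here is a minimal simulation sketch in Python. It is an illustration under stated assumptions, not the method from the post: measurements at a control site and a test site are assumed to be normally distributed with a known common standard deviation, an "impact" is assumed to be a shift in the mean, and the function names (detects_impact, error_rates) and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean


def detects_impact(control, site, sigma, alpha=0.05):
    """Two-sided z-test for a difference in means.

    Assumption: both samples share a known standard deviation `sigma`.
    Returns True when the test declares that an impact has occurred.
    """
    se = sigma * (1 / len(control) + 1 / len(site)) ** 0.5
    z = (mean(site) - mean(control)) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


def error_rates(true_effect, n=10, sigma=1.0, alpha=0.05, trials=5000, seed=1):
    """Estimate Type I and Type II error rates by repeated simulation."""
    rng = random.Random(seed)
    false_positives = 0  # impact declared although none exists (Type I)
    false_negatives = 0  # real impact missed (Type II)
    for _ in range(trials):
        control = [rng.gauss(0.0, sigma) for _ in range(n)]
        # Scenario 1: the test site is not impacted (same mean as control).
        unimpacted = [rng.gauss(0.0, sigma) for _ in range(n)]
        if detects_impact(control, unimpacted, sigma, alpha):
            false_positives += 1
        # Scenario 2: the test site is impacted (mean shifted by true_effect).
        impacted = [rng.gauss(true_effect, sigma) for _ in range(n)]
        if not detects_impact(control, impacted, sigma, alpha):
            false_negatives += 1
    return false_positives / trials, false_negatives / trials


# Compare a small and a larger sampling effort for the same hypothetical impact.
for n in (10, 40):
    type_1, type_2 = error_rates(true_effect=1.0, n=n)
    print(f"n={n}: Type I rate ~ {type_1:.3f}, Type II rate ~ {type_2:.3f}")
```

In this sketch the Type I rate stays close to the chosen alpha regardless of sample size, while the Type II rate depends on the size of the impact, the variability and the number of samples, which is one reason sampling effort matters when trying to detect a real impact.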
The number of studies published every year in ecology and evolutionary biology has increased rapidly over the past few decades. Each new study contributes more to what we know about a topic, adding nuance and complexity that helps improve our understanding of the natural world. To make sense of this wealth of evidence and get closer to a complete picture of the world, researchers are increasingly turning to systematic review methods as a way to synthesise this information.
What is a Systematic Review?
Systematic reviews, first developed in public health fields, take an experimental design approach to reviewing the literature. They treat the search for primary studies as a transparent and reproducible data gathering process. The rigorous methods used in systematic reviews make them a trusted form of evidence synthesis. Researchers use them to summarise the state of knowledge on a topic and make policy and practice recommendations.
In our recent publication (Rabosky et al. 2018) we assembled a huge phylogeny of ray-finned fishes: the most comprehensive to date! While all of our data are accessible via Dryad, we felt like we could go the extra mile to make it easy to repurpose and reuse our work. I’m pleased to report that this effort has resulted in two resources for the community: the Fish Tree of Life website, and the fishtree R package. The package is available on CRAN now, and you can install it with:
install.packages("fishtree")
The source is on GitHub in the repository jonchang/fishtree. The manuscript describing these resources has been published in Methods in Ecology and Evolution (Chang et al. 2019).
Modelling species distributions involves relating a set of species occurrences to relevant environmental variables. An important step in this process is assessing how good your model is at figuring out where your target species is. We generally do this by evaluating the predictions made for a set of locations that aren’t included in the model fitting process (the ‘testing points’).
Random splitting of the species occurrence data into training and testing points
The usual practical advice is that, for reliable validation, the testing points should be independent of the points used to train the model. But truly independent data are often not available. Instead, modellers usually split their data into a training set (for model fitting) and a testing set (for model validation), and this can be repeated to produce multiple splits (e.g. for cross-validation). The splitting is typically done randomly, so testing points sometimes end up located close to training points. You can see this in the figure to the right: the testing points are in red and training points are in blue. But could this cause any problem?
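The random split described above can be sketched in a few lines of Python. This is only an illustration: the occurrence points are made-up coordinates on a unit square, distances are plain Euclidean (real longitude/latitude data would need a geographic distance), and the function names random_split and nearest_training_distance are hypothetical, not taken from any package.

```python
import math
import random
from statistics import median


def random_split(points, test_fraction=0.2, seed=0):
    """Randomly split occurrence points into (training, testing) lists."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def nearest_training_distance(test_point, training):
    """Euclidean distance from one testing point to its closest training point."""
    return min(math.dist(test_point, p) for p in training)


# Made-up occurrence records: 200 random (x, y) points on a unit square.
rng = random.Random(42)
occurrences = [(rng.random(), rng.random()) for _ in range(200)]

training, testing = random_split(occurrences, test_fraction=0.2)
distances = [nearest_training_distance(p, training) for p in testing]

print(f"{len(training)} training points, {len(testing)} testing points")
print(f"median distance to nearest training point: {median(distances):.3f}")
```

With a random split, the distance from each testing point to its nearest training point tends to be small relative to the size of the study area, which is the situation the question above is asking about.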