In our recent paper in Methods in Ecology and Evolution, Alessandro Lúcio and I describe a new R package, metan, for multi-environment trial analysis. Multi-environment trials are a kind of trial in plant breeding programs where several genotypes are evaluated in a set of environments. Analyzing such data requires a combination of several approaches, including data manipulation, visualization and modelling. The latest stable version of metan (v1.5.1) is now on CRAN. So, I want to share the story of my first foray into using R, creating an R package, and submitting a paper to a journal I had never submitted to before.
How organisms adapt to the environment they live in is a key question in evolutionary biology. Genetic variation, i.e. how individuals within populations differ from each other in terms of their DNA, is an essential element in the process of adaptation. It can arise through different mechanisms, including DNA mutations, genetic drift, and recombination.
Differences in DNA sequences between individuals can result in differences in the expression of genes. These differences can in turn determine the organism’s capacity to grow, develop, and react to environmental stimuli. However, a growing body of literature reveals that there are other ways organisms can change the way they interact with the world without mutations in the DNA sequence.
Imagine that you want to catalogue all of the biodiversity (all of the living organisms) from a particular location; how many trained experts would that require? How many person hours would it take to collect and identify all of the rare, well-disguised, and microscopic organisms? How many of these organisms would have to be removed from the environment and taken back to a lab for taxonomic analysis?
Although there is no substitute for human expertise, we have begun using the traces of DNA that organisms leave behind (e.g. excretions, skin and hair cells) in the environment to catalogue biodiversity. These traces of DNA, referred to as environmental DNA, can persist in the environment for anything from minutes to centuries, depending on where they end up. This field of environmental DNA (eDNA) is rapidly becoming an effective tool to complement surveys of biodiversity, both past and present.
There are many reasons that we might be interested in whether individuals, species or populations overlap in multidimensional space. In ecology and evolution, we might be interested in climatic overlap, morphological overlap, phenological or biochemical overlap. We can use analyses of overlap to study resource partitioning, evolutionary histories and palaeoenvironmental conditions, or to inform conservation management and taxonomy. Even these represent only a subset of the possible cases in which we might want to investigate overlap between entities. Databases such as GBIF, TRY and WorldClim make vast amounts of data publicly available for these investigations. However, these studies require complex multivariate data and distilling such data into meaningful conclusions is no walk in the park.
Today, science extends beyond the research bench or the fieldsite more often than ever before. Scientists are continuously interacting with educators and the general public, and people are reciprocating the interest with a drive to be involved.
With this integration of science and the public, citizen-science efforts to crowdsource information have become increasingly popular (check out Zooniverse, SciStarter, NASA Citizen Science Projects, Project FeederWatch, and Foldit to get involved!). In the birding community, enthusiasts have been observing and recording birds for decades, but now there are methods for immediate data sharing among the community (eBird).
Hackathons have become a regular feature in the data-science world. Get a group of people with a shared interest together, give them data, food, and a limited amount of time and see what they can produce (often with prizes to be won). Translated into the world of academia as research hackathons, these events are a fantastic way to foster collaboration, interdisciplinary working and skills sharing.
The Quantitative Ecology hackathon was an intense day of coding resulting in creative and innovative research ideas using social and ecological data. Teams worked through the day to develop their ideas with support from experts in R, open science and statistics. We ended up with five projects addressing questions from, ‘Who has the least access to nature?’ to ‘Where should citizen scientists go to collect new data?’.
Artificial intelligence (or AI) is an enormously hot topic, regularly hitting the news with the latest milestone of computers matching or exceeding the capacity of humans at a particular task. For ecologists, one of the most exciting and promising uses of artificial intelligence is the automatic identification of species. If this could be reliably cracked, the streams of real-time species distribution data that could be unlocked worldwide would be phenomenal.
Despite the hype and rapid improvements, we’re not quite there yet. Although AI naturalists have had some successes, they can also often make basic mistakes. But we shouldn’t be too harsh on the computers, since identifying the correct species just from a picture can be really hard. Ask an experienced naturalist and they’ll often need to know where and when the photo was taken. This information can be crucial for ruling out alternatives. There’s a reason why field guides include range maps!
Currently, most AI identification tools only use an image. So, we set out to see if a computer can be taught to think more like a human, and make use of this extra information.
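To make the idea of using "where and when" concrete, here is a minimal Python sketch. It is not the method from the paper: it simply assumes that an image classifier returns a score per candidate species and that we have a rough prior for how likely each species is at the photo's location and date. The function name `combine`, the species names, and all numbers are hypothetical.

```python
# Illustrative sketch only: reweight image-based scores by a location/date prior.
# The scores, priors and species names are invented for this example.

def combine(image_scores, location_prior):
    """Multiply each image score by its prior, then renormalise to sum to 1."""
    unnormalised = {
        species: score * location_prior.get(species, 0.0)
        for species, score in image_scores.items()
    }
    total = sum(unnormalised.values())
    if total == 0:
        # The prior rules out every candidate: fall back to the image scores.
        return dict(image_scores)
    return {species: value / total for species, value in unnormalised.items()}

# The image alone slightly favours species_a ...
image_scores = {"species_a": 0.55, "species_b": 0.45}
# ... but species_a is rarely recorded at this place and time of year.
location_prior = {"species_a": 0.1, "species_b": 0.9}

posterior = combine(image_scores, location_prior)
best = max(posterior, key=posterior.get)
print(best)
```

In this toy case the extra context overturns the image-only ranking, which is the same job a range map does for a human naturalist.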
We have now entered the era of artificial intelligence. In just a few years, the number of applications using AI has grown tremendously, from self-driving cars to recommendations from your favourite streaming provider. Almost every major research field is now using AI. Behind all this, there is one constant: the reliance, in one way or another, on deep learning. Thanks to its power and flexibility, this subset of AI is now everywhere, even in ecology, as we show in ‘Applications for deep learning in ecology’.
But what is deep learning exactly? What makes it so special?
Deep Learning: The Basics
Deep learning is a set of methods based on representation learning: a way for machines to automatically detect how to classify data from raw examples. This means they can detect features in data by themselves, without any prior knowledge of the system. While some models can learn without any supervision (i.e. they can learn to detect and classify objects without knowing anything about them), so far these models are outperformed by supervised models. Supervised models require labelled data to train. So, if we want a model to detect cars in pictures, it will need example pictures labelled as containing cars in order to learn to recognise them.
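The supervised part of this can be shown with a toy Python sketch. It is deliberately not a deep network: it is a nearest-centroid classifier, a much simpler supervised model, and it works on hand-picked features rather than learning them from raw pixels. The feature values (say, length and height in metres) and the labels are made up for illustration.

```python
# Toy supervised learning: learn from labelled examples, then label a new one.
# This is a nearest-centroid classifier, not a deep learning model.

def train(examples):
    """Average the feature vectors of each label to get one centroid per label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], features))

# Labelled training data: (features, label). All values are invented.
labelled = [
    ([4.0, 1.5], "car"),
    ([4.5, 1.4], "car"),
    ([1.8, 1.1], "bicycle"),
    ([1.7, 1.0], "bicycle"),
]

model = train(labelled)
prediction = predict(model, [4.2, 1.6])
print(prediction)
```

Without the labels in `labelled`, `train` would have nothing to average, which is the sense in which a supervised model needs labelled examples. A deep learning model differs in that it also learns the features themselves from the raw data.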
A warning: Halloween is nigh, and the following post contains graphic real-life imagery of maggot-eaten eye-sockets and deadly pianos. Read on… if you dare!
A Death in the Woods
In the vast and often frozen boreal forest of northern Canada there is a slow-burning forensic investigation into a death. The victim: a woodland caribou, an iconic species that is threatened or endangered throughout its range.
The scene is very much made for TV neo-Scandinavian neo-noir. From a not-too-luxurious regional office in the town of Fort Smith, just north of the Alberta border, over a steaming cup of coffee, world-weary biologist Allicia Kelly – who’s seen it all and then some – is monitoring the movements of collared animals on her computer screen. It’s the middle of May. The females, nearly all pregnant, are scattering to higher ground to find suitably cozy and secluded sites to calve. All is as peaceful and idyllic as a bunch of blips on a computer screen can be.
But then (cue slightly unsettling dissonance in the soundtrack) one of the little blips seems to have stopped moving. Kelly raises her eyebrow, tells herself to keep an eye out. A moment later she makes the call: “Team, we’ve got another ringer … let’s roll!”