Post provided by Cyril Milleret
Spatial Capture-Recapture and Computation Time
The estimation of population size is one of the primary goals and challenges in wildlife ecology. Within the last decade and a half, a new class of tools has emerged, allowing us to estimate abundance and other key population parameters in specific areas. So-called spatial capture-recapture (SCR) models are growing in popularity not only because they can map abundance, but also because they can be fitted to data collected from a variety of monitoring methods. For example, the ever-increasing use of non-invasive monitoring methods, such as camera trapping and non-invasive genetic sampling, is one of the reasons SCR models have become so popular.
Another strength of SCR models is the ability to make population-level inferences. But the wider the region you’re monitoring, the greater the computational burden, which challenges the use of such methods at very large scales.
The Number of Detectors is the Main Issue
In SCR models, individual detections generally occur at fixed points in space called detectors or traps. The basic concept of SCR models is to use the information contained in the spatial pattern of individual detections/non-detections to estimate the location of their activity centers. To do so, an observation model, which models the detection probability of an individual as a function of the distance to its activity center, has to be estimated.
In terms of computation, this means that a separate detection probability has to be computed for each individual at each detector, so computation time grows rapidly with the number of detectors. A simple way to reduce the number of detectors (and the number of calculations) is to spatially aggregate individual detections. Aggregating data like this increases its coarseness: as the aggregation becomes coarser, computation time decreases, but the risk of faulty inference increases.
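To make the computational cost concrete, here is a minimal sketch of the core calculation in an SCR observation model. It assumes a half-normal detection function (a common choice in SCR, not necessarily the one used in any particular study), and the individual counts, detector counts, and parameter values are illustrative:

```python
import numpy as np

def half_normal_p(dist, p0, sigma):
    """Half-normal detection function: baseline probability p0
    decays with distance from the activity center."""
    return p0 * np.exp(-dist**2 / (2 * sigma**2))

rng = np.random.default_rng(1)
n_individuals, n_detectors = 50, 400          # illustrative sizes
acs = rng.uniform(0, 10, size=(n_individuals, 2))   # activity centers (x, y)
dets = rng.uniform(0, 10, size=(n_detectors, 2))    # detector coordinates

# Distance from every activity center to every detector: an N x J matrix.
d = np.linalg.norm(acs[:, None, :] - dets[None, :, :], axis=2)

# One detection probability per individual per detector -- this N x J
# evaluation is what makes the number of detectors the computational
# bottleneck, and it is repeated at every iteration of model fitting.
p = half_normal_p(d, p0=0.2, sigma=1.5)
print(p.shape)  # (50, 400)
```

Halving the number of detectors halves the size of this matrix, which is why aggregation pays off so directly.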
The study that led to our paper ‘Using partial aggregation in spatial capture recapture’ was motivated by our own need for increased computational efficiency of SCR models in project RovQuant. RovQuant aims to estimate the density of several large carnivores (wolverines, brown bears and wolves) over a large spatial extent (Norway and Sweden) using non-invasive genetic sampling (Rovbase) and SCR models.
Spatial Aggregation of Detections for Non-Invasive Genetic Sampling
While aggregating detections results in a loss of information and might be seen as a waste of field effort, it’s very convenient for data from search-encounter sampling. Let’s take the example of the monitoring program of wolverines in Norway. Observers from the Norwegian State Nature Inspectorate use snowmobiles or cross-country skis to search for samples (mainly fecal samples) from wolverines. Using microsatellite genotyping, they are able to identify individuals.
In this example, detections of individuals occur continuously in space. As an alternative to modeling the continuous space search process, it’s convenient to define ‘pseudo-detectors’. To do this, we draw a grid over the area that has been searched. We then use the centers of grid cells as our ‘detectors’ and associate each detection to the closest grid cell center (i.e. spatial aggregation of detections to grid cell centers). The size of the cell – the distance between detectors – will dictate the number of detectors and to a large extent computing time.
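The snapping of continuous-space detections to grid-cell centers can be sketched in a few lines. The coordinates and cell size below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
# x, y locations of genetic samples found along search tracks (simulated).
detections = rng.uniform(0, 10, size=(200, 2))

cell = 2.0  # grid-cell size: a larger cell means fewer 'pseudo-detectors'
# Snap each detection to the center of the grid cell it falls in.
centers = (np.floor(detections / cell) + 0.5) * cell

# The unique cell centers are the pseudo-detectors; counts gives the
# number of detections aggregated into each one.
pseudo_detectors, counts = np.unique(centers, axis=0, return_counts=True)
print(len(pseudo_detectors))  # detectors the model must now handle
```

Doubling `cell` roughly quarters the number of pseudo-detectors on a 2-D grid, which is where the large computational savings come from.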
Our simulation results revealed that spatially aggregating detections leads to biased parameter estimates, but also a huge reduction in computation time. Bias is a particular problem for the parameters that describe the relationship between detection probability and the distance between a detector and an individual’s (unobserved) activity center. Abundance estimates tended to be less affected by aggregation, though.
But How Should You Aggregate?
One important decision when aggregating is how to summarize your detections. Should we count the number of detections occurring within these detector cells or only consider the presence/absence of detections at detectors? Not considering the number of detections may again result in a loss of information. However, the process of local fecal deposition (or other accumulation of samples) might not always be the outcome of space use. It could easily be due to more complex behaviors (e.g. scent-marking). It’s generally better to use presence/absence data as it avoids having to specify a model for non-independence of detection events.
In ‘Using partial aggregation in spatial capture recapture’, we developed a solution that allows the researcher to retain the advantages of the binary observation model while keeping a maximum amount of the original information when performing spatial aggregation – partial aggregation of binary data (PAB). The PAB observation model divides each detector into a set of sub-detectors and models the number of sub-detectors with at least one detection as a binomial response. This retains more of the original information during aggregation, which improves our inferences.
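The construction of the PAB response can be sketched as follows. This is a simplified illustration of the data-building step only, with simulated binary detections and made-up sizes; it is not the full model from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
J = 5  # aggregated detectors (illustrative)
K = 4  # sub-detectors per aggregated detector (e.g. a 2x2 split)

# Binary detections (0/1) of one individual at each sub-detector,
# simulated here purely for illustration.
sub_detections = rng.binomial(1, 0.3, size=(J, K))

# PAB response: for each aggregated detector, the number of its K
# sub-detectors with at least one detection. In the model this count
# is treated as Binomial(K, p_j), with p_j the detection probability
# evaluated at detector j.
y = sub_detections.sum(axis=1)
print(y)  # one count between 0 and K per aggregated detector
```

Compared with collapsing each detector to a single 0/1 value, the count out of K preserves information about how widely the individual was detected within the detector cell.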
What We Learned
Strong spatial aggregation of detections causes bias in parameter estimates, but this bias can be kept low (<10%) if the aggregation remains moderate. Doing this significantly reduces computing time without large negative effects on our inferences. If you’re aggregating binary detections, we’d strongly suggest using our PAB model: it retains more information and produces estimates with lower bias than simple aggregation of binary data.
To find out more, read the full Methods in Ecology and Evolution article ‘Using partial aggregation in spatial capture recapture’.