**The following is a press release about the Methods paper ‘An examination of index-calibration experiments: counting tigers at macroecological scales’, taken from the University of Oxford News and Events page:**

Flaws in a method commonly used in censuses of tigers and other rare wildlife put the accuracy of such surveys in doubt, a new study suggests.

A team of scientists from the University of Oxford, Indian Statistical Institute, and Wildlife Conservation Society exposes, for the first time, inherent shortcomings in the ‘index-calibration’ method that mean it can produce inaccurate results. Amongst recent studies thought to be based on this method is India’s national tiger survey (January 2015), which claimed a surprising but welcome 30 percent rise in tiger numbers in just four years.

The team urges conservation practitioners to guard against these sources of error, which could mislead even the best conservation efforts, and suggests a constructive way forward using alternative methods of counting rare animals that avoid the pitfalls of the index-calibration approach.

A report of the research is published this week in the journal *Methods in Ecology and Evolution*.

Index-calibration often relies on measuring animal numbers accurately in a relatively small region using reliable, intensive and expensive methods (such as camera trapping) and then relating this measure to a more easily obtained, inexpensive indicator (such as animal track counts) by means of calibration. The calibrated index is then used to extrapolate actual animal numbers over larger regions.
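The calibration-then-extrapolation idea described above can be sketched in a few lines. This is only an illustration under simple assumptions, not the procedure used in any particular survey: it assumes a straight-line relationship fitted by ordinary least squares, and the function names (`fit_line`, `calibrate_and_extrapolate`) and all numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def calibrate_and_extrapolate(calib_index, calib_abundance, survey_index):
    """Fit the index-to-abundance line on the small, intensively studied
    region, then apply it to index values from the wider region."""
    intercept, slope = fit_line(calib_index, calib_abundance)
    return [intercept + slope * xi for xi in survey_index]


# Hypothetical calibration sites: track counts (cheap index) paired with
# camera-trap abundance estimates (expensive, assumed accurate).
track_counts = [1, 2, 3, 4]
camera_trap_abundance = [2, 4, 6, 8]

# Track counts from sites where no camera trapping was done.
predicted = calibrate_and_extrapolate(track_counts, camera_trap_abundance, [5, 10])
```

Everything downstream depends on the fitted line holding outside the calibration sites, which is the assumption the study questions.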

This approach has been popular among wildlife conservation agencies to generate animal numbers at a regional and national level. These numbers are then used to inform conservation efforts and direct resources worth millions of pounds.

To investigate index-calibration the team created a mathematical model describing the approach and then tested its efficiency when different values, representing variations in data, were inputted. Under most conditions the model was shown to lose its efficiency and power to predict. The team then tested this mathematical model on a real world example: attempting to derive tiger numbers from fieldwork data. The index-calibration model was shown to be unreliable again, with any high degree of success shown to be down to chance, rather like being dealt a single incredibly ‘high value’ poker hand, that could not be replicated.
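The kind of experiment described above can be mimicked with a toy simulation. This is a deliberately simplified sketch, not the authors' model: it assumes each site's true abundance is a random integer between 5 and 50, that each animal is detected independently with probability p, and that p is either the same at every site or varies uniformly around its mean. The function names and every parameter value are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def simulate_r2(n_sites, p_mean, p_spread, seed):
    """Simulate an index (animals detected) against true abundance and
    return the R-squared of the relationship.

    p_spread = 0 gives constant detection; larger values let detection
    probability differ from site to site (assumed uniform variation).
    """
    rng = random.Random(seed)
    abundance, index = [], []
    for _ in range(n_sites):
        n_animals = rng.randint(5, 50)  # assumed true abundance at the site
        p = rng.uniform(p_mean - p_spread, p_mean + p_spread)
        # Each animal is detected independently with probability p.
        detected = sum(rng.random() < p for _ in range(n_animals))
        abundance.append(n_animals)
        index.append(detected)
    return r_squared(index, abundance)


r2_constant_p = simulate_r2(2000, p_mean=0.5, p_spread=0.0, seed=1)
r2_variable_p = simulate_r2(2000, p_mean=0.5, p_spread=0.3, seed=1)
print(f"R^2 with constant detection: {r2_constant_p:.2f}")
print(f"R^2 with variable detection: {r2_variable_p:.2f}")
```

Under these assumptions the index tracks abundance less closely once detection varies between sites. How much the fit degrades depends entirely on the assumed values, so the printed figures should not be read as estimates for any real survey.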

Arjun Gopalaswamy, lead author of the report from the Wildlife Conservation Research Unit at Oxford University’s Department of Zoology, said: ‘Our study shows that index-calibration models are so fragile that even a 10 percent uncertainty in detection rates severely compromises what we can reliably infer from them. Our empirical test with data from Indian tiger survey efforts proved that such calibrations yield irreproducible and inaccurate results.’

Arjun added: ‘Index-calibration relies on the assumption that detection rates of animal evidence are high and unvarying. In reality this is nearly impossible to achieve. Instead, there are many flexible approaches, developed over the past decade by statistical ecologists, which can cut through noisy ‘real world’ data to make accurate predictions.’

Dr Ullas Karanth, a co-author from the Wildlife Conservation Society, and a member of India’s National Tiger Conservation Authority, said: ‘This study exposes fundamental statistical weaknesses in the sampling, calibration and extrapolations that are at the core of the methodology used by the Government to estimate India’s numbers, thus undermining their reliability. We are not at all disputing that tiger numbers have increased in many locations in India in the last 8 years, but the method employed to measure this increase is not sufficiently robust or accurate to measure changes at regional and country-wide levels.’

Professor Mohan Delampady, a co-author from the Indian Statistical Institute, said: ‘The findings have wider consequences for several applied sciences where sampling and direct extrapolation is involved, especially when sampling errors are influenced by unknown detection probabilities.’

Professor David Macdonald, a co-author and the founding Director of the Wildlife Conservation Research Unit at Oxford University’s Department of Zoology, said: ‘This is a breakthrough which will dramatically change how we count wildlife numbers in the future.’ He added: ‘Index-calibration can work well, if the correlations are tight and consistent, but often they aren’t, and many of us, myself included, for example in the context of estimating numbers of mink and water voles in the UK, have been using the technique without appreciating its risks. Our intention is to help conservationists by highlighting the conditions when index calibration can be misleading. Everybody will benefit from greater accuracy when it comes to counting rare animals.’

The team say that the aim of the study is to help ecologists and conservationists to address the global challenge of counting rare and elusive animals. The good news is that the mathematical model created by the team provides the crucial ‘link’ between some of the older methods (which don’t estimate detection rates) and some of the newer methods (which do estimate detection rates). The findings will help in the reanalysis of raw data from wildlife research. The study also recommends that estimates from future surveys will be most reliable if designed, a priori, keeping in mind the power of modern, robust, modelling approaches.

**Please see the original Press Release HERE for media contact details etc.**

Click HERE to access the full article (subscription to *Methods in Ecology and Evolution* or BES Membership required).

Neither this press release nor the paper bears critical examination. A full dissection of the paper would be a paper in itself; I flag just a few issues. The result for the ‘lower bound’ expected value of R-squared when p is variable cannot be correct, as it implies a homeopathic effect of variation in p: the effect increases as variation gets smaller and smaller. The language of ‘lower’ and ‘upper’ bounds is misleading – these are bounds on the expected (‘average’) value: actual R-squared may vary widely. Spurious explanations are adduced for spurious effects, such as the supposed inverse density-dependence of R-squared. The application to tigers is garbled: the authors equate the probability of detecting occupancy with probability of detecting an individual, and equate sampling error CV with individual variation CV. No particular conclusions may be drawn. Index calibration (also known as ‘double sampling’) remains a valid method. How well it works in a particular case depends on study design. Alternatives for achieving the same goal, only alluded to by the authors, are likely subject to many of the same constraints (a point acknowledged in the paper). The extravagant language of the paper is unseemly and unwarranted.

Dear Dr. Efford:

Sorry about the late reply; I was away on fieldwork. Many thanks for taking a closer look at the paper, and for your helpful comments. I believe many of them will further strengthen some conclusions. Yes, we develop models and derive the theoretical R^2 in this study and put theoretical bounds on it under defined assumptions. True, the estimated R^2 for a data set can differ from the theoretical value. That leads to further questions, such as what such a deviation means and how reliable such estimates of R^2 are. The primary derivations of R^2 are for individual detection probabilities. These can be used directly if the information is available from data sets. We additionally describe one possible way of mapping from site-specific detection probabilities by conditioning on k, the sampling occasions, when individual detection probabilities are not available directly. Since the term “mu” (mean abundance/density) appears in the denominator of the R^2 statistic for the variable-p* case, it appears as inverse density-dependence, though subtle in relative terms. In the tiger example, the mapping to individual detection probabilities is done “conservatively” (highest value of p*). Any lower, and the value of R^2 will drop further; this may be a useful exploration at a later time. Sampling error CV was not used; rather, the CV available from a single tiger occupancy survey was used as an example of the sort of variation present in the landscape. This will most likely be higher than what was conservatively assumed in the study. Thank you for noticing some potential lower bound-related technical issues. This will further strengthen conclusions about the tiger studies, though they were based mainly on the upper bounds. Yes, R^2-based direct index-calibration to estimate abundance at large scales is a valid method only when detection probability is very high and unvarying.
For surveys of large carnivores, especially in the tropics, the method is likely to be highly unreliable, because variances are likely to be too high owing to low detection probabilities, and changes in abundance cannot be detected. Alternative methods (a combination of statistical and field methods) that permit specific modeling of on-the-ground covariates on an estimable detection probability parameter, along with the main parameter of interest (say occupancy or abundance), are preferred.

Regards

Arjun

Surprising to see such high tiger density in Maharashtra…