Stuck between Zero and One: Modelling Non-Count Proportions with Beta and Dirichlet Regression

Post provided by JAMES WEEDON & BOB DOUMA

Chinese translation provided by Zishen Wang

This blog post is also available in Chinese

Proportion of leaf damage is a type of measurement that can lead to proportional data.

Imagine the scene: you’re presenting your exciting research results at an important international conference. Being conscientious and aware of statistical best practice, you’ve included test statistics and confidence intervals on all your result figures. Not just P values! Some of the data you are presenting involves the proportion of leaf surface damaged by an insect herbivore under different treatments. You finish your presentation (on time!) and there’s time for questions. From the audience a polite but insistent colleague asks: “Your confidence interval for that estimate goes from -0.3 to 0.5… how should we interpret a negative proportion of a leaf?”

Someone chuckles. As you nervously flick back to the slide in question, you mutter something about the difference between confidence intervals and point estimates. You start to feel dizzy. A murmur of confused voices slowly builds amongst the audience members. In the distance, a dog barks.

How can you avoid this?

Proportional Data in Ecology and Evolution

Many kinds of quantities that ecologists and evolutionary biologists routinely measure are most conveniently expressed as proportions. In many cases these proportions are derived from counts. The data are based on discrete entities that can be assigned to two or more classes: success or failure, male or female, invasive or non-invasive. In other cases the proportions are derived from continuous measurements: the proportion of time an animal spends on different activities; percent cover of a plant functional type in a vegetation survey quadrat; allocation of total plant biomass to different organs and tissues. What these data types have in common is that they can only take values between zero and one. Negative values, or values greater than one, don’t make any sense.

The Game of Zero and One: Modelling Non-Count Proportions with Beta and Dirichlet Regression

Post provided by JAMES WEEDON & BOB DOUMA

Chinese translation: Zishen Wang (王子申)

This post is also available in English

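Imagine the scene: you are presenting an exciting result at an important international conference. True to your rigorous attitude towards statistical theory and methods, you have run statistical tests on all your data and reported confidence intervals. These results don’t consist of P values alone! Some of the data you present involve the proportion of leaf area damaged by herbivorous insects under different treatments. When you finish your talk on time, a colleague asks: “Your confidence interval for the estimated proportion of damage runs from -0.3 to 0.5. How should we interpret a negative value for leaf area?”

Someone in the audience laughs. Red-faced, you flick back to the slide in question and mumble an explanation of the difference between confidence intervals and point estimates. The audience starts to murmur, and you think you hear a dog barking not far away.

How can you avoid this embarrassing situation that leaves everyone confused?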

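Proportional Data in Ecology and Evolution

Ecologists and evolutionary biologists routinely measure many quantitative variables, and for convenience they often express them as proportions. In many cases these proportions are derived from counts: the data are based on discrete entities that can be assigned to two or more classes, such as success or failure, male or female, invasive or non-invasive. Proportions can also be derived from continuous variables: the proportion of time an animal spends on different activities; the percent cover of a plant functional type in a vegetation survey plot; the allocation of plant biomass to different organs and tissues. What these proportional data have in common is that they can only take values between 0 and 1. Values below 0 or above 1 make no sense.

Two kinds of measurement that can yield proportional data: proportion of leaf damage and percent vegetation cover.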

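Analysing this kind of data with conventional statistical tools can cause problems. Methods such as linear regression and analysis of variance assume that the response variable can be modelled with a normal distribution. The normal distribution covers values from negative infinity to positive infinity, so it is poorly suited to modelling proportional data. Predictions and confidence intervals derived from a normal distribution may well include values outside the range on which proportions are defined. In addition, the residuals are strongly correlated with the predicted values. All of this indicates that choosing the wrong model leads to inaccurate statistical inference.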
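As a minimal sketch of the problem with normal-based analyses of proportions, the Python snippet below takes five made-up leaf-damage proportions (hypothetical values, not data from the post), computes a normal-theory 95% interval for their mean, and asks how much probability a normal model with the same mean and standard deviation places outside the range from zero to one. For comparison it draws from a beta distribution matched to the same mean and variance by the method of moments. That matching step is only an assumption chosen to illustrate a bounded distribution; it is not the beta regression discussed in the post.

```python
import math
import random
import statistics

# Hypothetical leaf-damage proportions (fraction of leaf area eaten).
# Illustrative values only, not data from the post.
damage = [0.01, 0.02, 0.02, 0.05, 0.60]

mean = statistics.mean(damage)
sd = statistics.stdev(damage)          # sample standard deviation
se = sd / math.sqrt(len(damage))       # standard error of the mean

# Normal-theory 95% interval for the mean (z approximation).
z = statistics.NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

# Probability that a normal model with this mean and sd
# assigns to impossible values (below 0 or above 1).
fitted_normal = statistics.NormalDist(mean, sd)
mass_outside = fitted_normal.cdf(0.0) + (1.0 - fitted_normal.cdf(1.0))

# Method-of-moments beta distribution with the same mean and variance.
# Only valid when the variance is smaller than mean * (1 - mean).
common = mean * (1.0 - mean) / sd**2 - 1.0
alpha, beta = mean * common, (1.0 - mean) * common

rng = random.Random(1)  # fixed seed so the draws are repeatable
beta_draws = [rng.betavariate(alpha, beta) for _ in range(10_000)]

print(f"normal 95% CI for the mean: ({ci_low:.3f}, {ci_high:.3f})")
print(f"normal mass outside [0, 1]: {mass_outside:.3f}")
print(f"beta draws range: [{min(beta_draws):.3f}, {max(beta_draws):.3f}]")
```

With these values the lower limit of the normal interval is negative, while every draw from the beta distribution stays between zero and one.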

Mosquitoes, Climate Change and Disease Transmission: How the Suitability Index P Can Help Improve Public Health and Contribute to Education

Post Provided by JOSÉ LOURENÇO

This blog post is also available in Portuguese

©BARILLET-PORTAL David

Vector-borne viruses (like those transmitted by mosquitoes) are (re)emerging and they’re hurting local economies and public health. Some typical examples are the West Nile, Zika, dengue, chikungunya and yellow fever viruses. The eco-evolutionary and epidemiological histories of these viruses differ massively. But they share one important factor: their transmission potential is highly dependent on the underlying mosquito population dynamics.

An ultimate challenge in infectious disease control is to prevent the start of an outbreak or alter the course of an ongoing outbreak. To achieve this, understanding the ecological, demographic and epidemiological factors driving a pathogen’s transmission success is essential. Without this information, public health planning is immensely difficult. To get this information, dynamic mathematical models of pathogen transmission have been successfully applied since the mid-20th century (e.g. malaria and dengue).

Mosquitoes, Climate and Pathogen Transmission: How the Index P Can Contribute to Public Health and Education

Blog post provided by JOSÉ LOURENÇO

This blog post is also available in English

©BARILLET-PORTAL David

Viruses transmitted by vectors (e.g. mosquitoes, ticks) are (re)emerging and having negative consequences for public health and local economies. Typical recent examples of mosquito-borne viruses include West Nile virus in North America, Israel and Europe, and the Zika, dengue, chikungunya, Mayaro and yellow fever viruses in South America and Africa. The epidemiology, ecology and evolution of these viruses are highly diverse, but they all share one critical factor: their transmission potential is highly dependent on the population dynamics of the mosquito species involved.

One of the main goals of infectious disease control is to prevent the start (or alter the course) of epidemics. To this end, dynamic transmission models have been used successfully since the mid-20th century (e.g. in the context of malaria). These models are computational approximations of real biological systems. They allow us to simulate a multitude of scenarios on our personal computers and so to test, reconstruct and project the epidemiological potential and behaviour of pathogens. When such simulations are compared with real observations (e.g. the number of cases reported by a surveillance system), the models offer answers about the mechanics of transmission and the epidemiological or demographic factors that may have contributed to particular patterns observed in the data. While dynamic models are one of the cornerstones of contemporary epidemiology, imperfect or missing data can make it difficult (if not impossible) to design, implement and make use of these models. Data can be imperfect for many reasons, including weak surveillance systems, human error and lack of investment.

What Biases Could Your Sampling Methods Add to Your Data?

Post provided by ROGER HO LEE

This blog post is also available in Chinese

Have you ever gone fishing? If so, you may have had the experience of not catching any fish, while the person next to you got plenty. If you walked along the pier or bank, you may have seen that other fishermen and -women caught fish of various shapes and sizes. You’d soon realise that each person was using a different set of equipment and baits, and of course, that the anglers differed in their skills and experience. Beneath the water were many fish, but whether you could catch them, or which species could even be caught, all depended on your fishing method, as well as where and how the fish you were targeting lived.

Designing Sampling Protocols

Head view of different ant species found in Hong Kong and further in South East Asia.

This is a lot like the situation that ecologists often face when designing sampling protocols for field surveys. While a comprehensive survey will yield the most complete information, few of us have the resources to capture every member of the community we’re studying. So, we take representative samples instead. But the method(s) used for sampling will only allow us to collect a subset of the species which are present. This selection of the species is not random per se – it’s dependent on species’ life history.

What Kind of Data Bias Can Sampling Methods Introduce?

Post provided by ROGER HO LEE (李灝)

This blog post is also available in English

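Have you ever gone fishing? If so, the following experience will sound familiar. You have been fishing for most of the day and your rod has not so much as twitched, while the angler next to you is heading home with a full catch. Feeling discouraged, you walk along the pier or bank and see that other people’s catches include fish of all sizes and shapes. Caught between puzzlement and frustration, you suddenly realise that each person is using different tackle and bait (and, of course, that every angler differs in skill and experience). There are all kinds of fish in the water, but whether you can catch them, and which species you catch, depends on your fishing method, as well as on where and how your target species live.

Designing Sampling Protocols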

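Ant species of Hong Kong and the Southeast Asian region.

This experience is very similar to the situation ecologists face when designing field surveys. Although a comprehensive survey yields the most complete information, we rarely have the resources to sample an entire community. Instead, we can only collect a subset of species as a representative snapshot. It is worth noting that each sampling method only allows us to collect certain species in the community; these species are not selected at random, but according to their life history.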

Field Work on a Shoestring: Using Consumer Technology as an Early Career Researcher

Post provided by CARLOS A. DE LA ROSA

This blog post is also available in Spanish

Champagne Tastes on a Beer Budget

Freshly outfitted with a VACAMS camera and GPS unit, #1691 heads off into the forest with her calf. ©Carlos A. de la Rosa

There’s a frustrating yin and yang to biological research: motivated by curiosity and imagination, we often find ourselves instead defined by limitations. Some of these are fundamental human conditions. The spectrum of light detectable by human eyes, for example, means we can never see a flower the way a bee sees it. Other limitations, like funding and time, are realities of modern-day social and economic systems.

Early career researchers (ECRs) starting new projects and delving into new research systems must be especially creative to overcome the odds. Large grants can be transformative, giving a research group the equipment and resources to complete a study, but they’re tough to get. Inexperienced ECRs are at a disadvantage when competing against battle-hardened investigators with years of grant writing experience. Small grants of up to about $5000 USD, on the other hand, are comparatively easy to find. So, how can ECRs make the most of small, intermittent sources of funding?

I found myself faced with this question in the second year of my PhD field work.

Field Work on the Cheap: Using Consumer Product Technology as a Researcher at the Start of a Research Career

Contributed by CARLOS A. DE LA ROSA

This blog post is available in English

A Taste for Champagne on a Beer Budget

Freshly fitted with a VACAMS camera and GPS unit, cow No. 1691 heads into the forest with her calf. ©Carlos A. de la Rosa

There is a frustrating give-and-take in biological research: motivated by curiosity and imagination, we often find ourselves defined by limitations. Some of these, like our senses, are fundamental human conditions. The spectrum of light detectable by human eyes, for example, means we will never be able to see a flower the way a bee sees it. Other limitations, like funding and time, reflect the realities of today’s social and economic systems.

Researchers at the start of their careers (Early Career Researchers, or ECRs) who embark on new projects and get involved with new research systems must be especially creative to beat the odds. A generous grant can be transformative, but an inexperienced ECR is at a disadvantage when competing against battle-hardened investigators with years of experience writing funding proposals. Small grants in the range of $2,000 to $5,000, on the other hand, are comparatively easy to find. How can an ECR make the most of these small, intermittent sources of funding?

In the second year of my PhD field work I found myself faced with this puzzle.

Limitations and Benefits of the Unmatched Count Technique: Considering How We Use New Methods in Conservation

Post provided by Amy Hinsley and Ana Nuno

This blog post is also available in Portuguese

A New Conservation Toolbox

It is widely accepted that many conservation challenges are directly related to human behaviour. Whether it is the over-collection of a rare orchid by harvesters in Southeast Asia, or the decisions by collectors in Europe to buy and smuggle these orchids home, understanding the extent and nature of these behaviours is essential to addressing the threats they might cause. This has led conservation researchers and practitioners to start looking outside of their discipline, to find methods and approaches from across the social sciences to improve our understanding of these complex issues.

A research assistant carrying out a UCT survey about the use of Traditional Medicine products containing bear bile in China. © Chen Haochun.

While this interdisciplinarity is a positive move for conservation, it is important that we treat these ‘new’ methods carefully and understand their limitations. If we don’t, there is a risk that our new toolbox, full of exciting methods that sound great on a funding application, may in fact not be making what we do any better, or in extreme cases may even be making it worse.

With this in mind, a group of conservation social scientists, led by researchers at the Universities of Oxford and Exeter, decided to look in depth into one of these ‘new’ methods, to provide recommendations on when and how it should be used, and when it shouldn’t. Our Open Access article – ‘Asking sensitive questions using the unmatched count technique: Applications and guidelines for conservation’ – looks at the Unmatched Count Technique (UCT – also called the list experiment), which is increasingly being used in conservation to ask questions about ‘sensitive’ topics.

Limitations and Benefits of the Item Count Technique: Considerations on the Use of New Methods in Conservation

Blog post provided by AMY HINSLEY AND ANA NUNO

This blog post is also available in English

New Conservation Tools

Many conservation challenges are directly related to human behaviour. Whether it is the over-collection of a rare orchid in Southeast Asia, or the purchase and smuggling of these orchids by collectors in Europe, understanding the magnitude and nature of these behaviours is essential to dealing with the threats they may cause. This has led conservation researchers and practitioners to start looking outside their own discipline, to find methods and approaches from the social sciences to improve our understanding of these complex issues.

A research assistant carrying out a TCI survey about the use of traditional medicine products containing bear bile in China. © Chen Haochun.

While this interdisciplinarity is a positive step for conservation, it is important to treat these “new” methods with care and understand their limitations. If we don’t, there is a risk that our new toolbox, full of interesting methods that sound good in funding applications, does not actually improve what we already do or, in extreme cases, even makes it worse.

With this in mind, a group of conservation social scientists, led by researchers from the Universities of Oxford and Exeter, decided to examine one of these “new” methods in depth and to provide recommendations on when and how it should be used, and when it should not. The article, freely available this week in the journal Methods in Ecology and Evolution, examines the Item Count Technique (Técnica de Contagem de Itens, TCI), which is increasingly being used in conservation to ask questions about “sensitive” topics.