We want your input. We’ve got a good idea, and we have some ideas about how to continue it, but we’re not sure how to start.
Our great idea was to unashamedly do something like the Statistical Notes series that Martin Bland and Doug Altman have been writing for the BMJ for many years. They explain statistical concepts that are important for medics to understand. The style is informal and short, a bit like a blog post, really.
Our plan would be to largely follow this model, but not necessarily stick to statistics: anything relevant for ecology and evolutionary biology would be fair game. Although initially Rob and I are planning to write these pieces, we’ll open it up for other people to write too. The other difference is that we’ll put drafts here on the blog, so we can get feedback – see what works, what people have trouble with, etc.
So, now here’s our problem. We have ideas for topics to address, but they are all a bit too specialised (e.g. the logit link), so although they’ll be good topics, they’re not a good way of starting. We want to begin with a bang, with something important which will catch people’s attention. But we need some help in picking a topic.
As you might have guessed, this post is to ask for suggestions of what topics to cover. We especially want help with deciding how to start, but any suggestion of what you want to know about will be gratefully received.
Hi guys,
This series sounds like a really useful contribution.
In terms of big ideas to get started – would something like “what is pseudoreplication, and how can we avoid it?” be too broad a topic to take on? I know it’s a well understood problem among ecologists, but there is something to be said for clear and simple exposition.
There is how to avoid pseudo-replication (e.g., in data collection), and then there is how to account for it (e.g., in data analysis). The latter might lead to hierarchical models, etc.
Something on what hierarchical models are, and what they can do for you might work?
Such a series will be very useful!
I have been toying with a manuscript for Methods that discusses the issues of “random” sampling and sub-sampling for bats based on acoustic surveys.
After more than 15 years of examining data using ALL acoustic data files to derive relative abundance and relative activity in bats, I am convinced that many consultants doing wind power surveys are not adequately representing what is really going on by sub-sampling only ± 5 or 10 minutes of each hour.
This will miss, and does miss, the rare species that monitoring programs for wind energy are set up to detect.
My key problem is that I am not sure how to develop a “sub-sampling” routine for full data sets that would test this robustly.
Any potential co-authors/collaborators interested in assisting with this?
This is an issue that has not been addressed to date in the literature.
Cheers
Thanks for your comments. Both hierarchical models and pseudo-replication are worth thinking about.
Bruce – from what you describe, I don’t think it would be too difficult, but it depends on the exact nature of your data. You could try contacting a statistician, and either contracting the work or seeing if they have a student who could take a look.
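As a rough illustration of the kind of sub-sampling test described above, here is a minimal sketch. It assumes each detection can be reduced to a (species, hour, minute) record, which is a made-up layout, and it uses simulated nights with invented call rates rather than real survey data. All function names are hypothetical. The idea is simply to take full-night records, keep only the first few minutes of each hour, and count how often a species present in the full record disappears from the sub-sample.

```python
import random


def subsample(records, window_minutes, start_minute=0):
    """Keep only detections inside one window of each hour.

    `records` is a list of (species, hour, minute) tuples (assumed layout).
    """
    end = start_minute + window_minutes
    return [r for r in records if start_minute <= r[2] < end]


def species_seen(records):
    """Set of species with at least one detection."""
    return {species for species, _hour, _minute in records}


def missed_species(records, window_minutes):
    """Species present in the full record but absent from the sub-sample."""
    return species_seen(records) - species_seen(subsample(records, window_minutes))


def simulate_night(rng, hours=8, rates=None):
    """Simulate one full-night record.

    `rates` maps species to expected calls per hour; the defaults are invented.
    """
    if rates is None:
        rates = {"common": 20.0, "rare": 0.25}
    records = []
    for species, rate in rates.items():
        for hour in range(hours):
            # One Bernoulli trial per minute as a crude stand-in for a Poisson process.
            for minute in range(60):
                if rng.random() < rate / 60.0:
                    records.append((species, hour, minute))
    return records


def miss_rate(species, window_minutes, n_nights=500, seed=1):
    """Among nights where `species` was recorded, the fraction where sub-sampling loses it."""
    rng = random.Random(seed)
    present = missed = 0
    for _ in range(n_nights):
        night = simulate_night(rng)
        if species in species_seen(night):
            present += 1
            if species in missed_species(night, window_minutes):
                missed += 1
    return missed / present if present else float("nan")
```

With real data, `simulate_night` would be replaced by the actual full-night files, and the same comparison could be repeated for different window lengths and start minutes. Whether this is adequate depends on the exact nature of the data.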
If the logit link is too specialized, perhaps a nice little primer on the general linear model would be a good start.
Looking forward to seeing this series underway!
Hey, great idea on the ‘statistical notes’ series. One suggestion is something about model selection for different purposes, in particular prediction versus explanation. As old a chestnut as it might be, model selection in papers often follows the latest trends rather than really considering what the purpose of the model is and what might be the best method. For example, models which will be used to make predictions might be best selected using AIC, because in theory it should indicate models whose predictions best match the ‘missed out’ data. But in practice it doesn’t usually assess the ability of models to predict to new data, and predicting to the data used to build the model is entirely different from predicting to new data.
Given that many of the papers published in Methods deal with models that will be used to make predictions, this might be a really good starting topic?
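To make the distinction in the comment above concrete, here is a small sketch that computes both quantities for two candidate models: AIC on the data used to fit them, and mean squared error on a fresh data set. Everything in it is an assumption for illustration: the data are simulated from a straight line with Gaussian noise, the two candidates are an intercept-only model and a simple linear regression, and the AIC is the Gaussian least-squares form, which is only defined up to an additive constant.

```python
import math
import random


def fit_mean(xs, ys):
    """Intercept-only model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Ordinary least-squares straight line, closed form."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def rss(predict, xs, ys):
    """Residual sum of squares of `predict` on the given data."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


def mse(predict, xs, ys):
    """Mean squared prediction error on the given data."""
    return rss(predict, xs, ys) / len(xs)


def gaussian_aic(predict, xs, ys, n_coef):
    """AIC for a Gaussian least-squares fit, up to an additive constant.

    The parameter count is the regression coefficients plus the residual variance.
    """
    n = len(xs)
    k = n_coef + 1
    return n * math.log(rss(predict, xs, ys) / n) + 2 * k


def simulate(rng, n, slope=0.5, sd=1.0):
    """Simulated data from y = 1 + slope * x + noise (values are made up)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [1.0 + slope * x + rng.gauss(0.0, sd) for x in xs]
    return xs, ys


rng = random.Random(0)
train = simulate(rng, 50)     # data used to build the models
new = simulate(rng, 50)       # new data the models have never seen

mean_model = fit_mean(*train)
line_model = fit_line(*train)

print("AIC (training data):", gaussian_aic(mean_model, *train, n_coef=1),
      gaussian_aic(line_model, *train, n_coef=2))
print("MSE (new data):     ", mse(mean_model, *new), mse(line_model, *new))
```

In this easy simulated case the two criteria agree, because the new data come from exactly the same process as the training data. The concern raised above is about cases where they do not, for example when predictions are needed for new sites or years.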
That might be a good idea, if we can avoid making it controversial.
I’ve been meaning to write a rant about AIC (on my own blog, not here) which would include some of these topics, so perhaps I should write it, and then see if anything can be used.
I would love to read your rant on AIC. What is your blog URL?
Bruce – it’s http://occamstypewriter.org/boboh. I only moved my blog there a couple of days ago, and I haven’t had time to move my back catalogue of posts, so it’s a bit bare at the moment.