Poll: have sample sizes increased over time in ecology?

One very basic way to improve any empirical study is to increase the sample size. All else being equal, larger samples are better. They lead to more precise (i.e. repeatable) estimates of population parameters. And we’d all like to think, or hope, that scientific studies get better over time. That we’re doing better ecology now than we were, say, 50 years ago (but see).

So here’s my question for you: have sample sizes increased over time in ecology?

I know the answer. I’ve already crunched the numbers, using data from Costello and Fox (2022). But before I tell you the answer, I want to know what you think the answer’s going to be!

To help you out a bit, I’ll make the question more specific. “Sample size” means “sample size of any ecological study that reported a correlation coefficient and was later used in a meta-analysis.” “Over time” means “from 1945 to 2020”. “Increased” means “a positive association between sample size and publication year that could possibly reflect a real trend rather than just sampling error.” That is, weak, noisy, and/or nonlinear positive associations count, unless the positive association is so weak and noisy that it’s clearly just sampling error.
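If you'd like to see what that definition of "increased" could look like in practice, here is a minimal sketch. It is not the analysis I ran, just one way you might operationalize the question: a Spearman rank correlation (which tolerates noisy and nonlinear but monotone trends) between publication year and sample size, plus a one-sided permutation test to ask whether a positive association that large could plausibly be sampling error alone. The function names and the toy numbers are made up for illustration, and the toy numbers say nothing about the real answer.

```python
import random


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (raises) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """One-sided p-value: how often does shuffling y give an association
    at least as positive as the observed one?"""
    rng = random.Random(seed)
    rank_x, rank_y = ranks(x), ranks(y)
    observed = pearson(rank_x, rank_y)
    shuffled = rank_y[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(rank_x, shuffled) >= observed:
            hits += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical toy data, invented for illustration only.
    years = [1950, 1962, 1975, 1981, 1990, 1998, 2004, 2011, 2016, 2020]
    sample_sizes = [24, 18, 40, 12, 35, 30, 22, 60, 15, 28]
    print("Spearman rho:", round(spearman(years, sample_sizes), 3))
    print("one-sided permutation p:", permutation_p_value(years, sample_sizes))
```

A rank-based measure fits the wording of the question because "weak, noisy, and/or nonlinear" trends still count, as long as they are positive and unlikely to be pure sampling error. With many thousands of data points you would probably reach for a statistics library rather than this plain-Python loop.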

Given the lengthy timespan of this dataset, you can certainly imagine reasons why sample sizes might have increased. A lot has changed in ecological research since 1945! For instance, this dataset goes back to before the time when national governments started funding scientific research in a large-scale, systematic way. This dataset goes back to a time when there were many fewer ecologists, they mostly didn’t collaborate with one another, and they didn’t publish all that often. This dataset goes back to a time when technologies like PCR and personal computers didn’t exist. Etc. On the other hand, I’m sure you can also imagine reasons why ecological sample sizes might not have changed, despite all those changes in other aspects of ecological research. Your job is to decide which bits of your imagination you believe. 🙂

No cheating and downloading the dataset from Dryad to figure out the answer for yourself before you take the poll! Although it is not cheating if you can recall the related post I wrote a few years ago, and use your memory of that old post to inform your guess… 🙂

Without wanting to give the answer away, please take my word that the answer is very clear-cut. As you’d expect it would be, given that the dataset comprises over 16,000 correlation coefficients, their sample sizes, and their publication dates. We have plenty of data to answer this question!

