Vote Counting Is Dead (Except When It Isn’t): A Zombie Method in Ecology

1 The persistence of a method we all know is wrong

Vote counting is one of those methods that most evidence synthesis guidelines discourage, and yet it keeps reappearing in ecological and conservation reviews.

Sometimes it appears explicitly (“X studies found a positive effect, Y found none”).
Sometimes implicitly (“most studies suggest…”).
Sometimes visually, through heatmaps or bar charts that invite the reader to mentally tally directions or significance.

This persistence is not because reviewers are careless. It is because vote counting feels intuitive under uncertainty, especially when data are heterogeneous, sample sizes are small, and formal meta-analysis feels overconfident.

In this post, I show why vote counting is especially misleading under the conditions that dominate ecological evidence bases, and why different forms of vote counting can lead to mutually contradictory conclusions from the same data.

2 What do we mean by “vote counting”?

Although it comes in different guises, vote counting usually boils down to one of three approaches.

2.1 Direction-based vote counting

“Most studies report a positive effect.”

Each study contributes one vote based on the sign of its estimated effect.

2.2 Significance-based vote counting

“Only X studies found a statistically significant effect.”

Each study contributes one vote based on whether p < 0.05.

2.3 Visual vote counting

Forest plots, heatmaps, or summary figures are presented with minimal synthesis, inviting the reader to visually tally directions or “significant” results.

These approaches differ in presentation, but they share a critical assumption:

Each study contributes one equal vote, regardless of precision, power, or context.

3 Why vote counting feels reasonable in ecology

Vote counting is often defended (explicitly or implicitly) because:

  • Effect sizes are hard to compare across systems
  • Sample sizes vary wildly
  • Heterogeneity is extreme
  • Review authors want to avoid “over-interpretation”

In highly heterogeneous ecological data, vote counting can feel like the conservative choice.

Unfortunately, it is usually the least informative one.

4 A simple simulation: the conditions where vote counting thrives

To make this concrete, I simulate a small ecological evidence base with:

  • A real, non-zero effect
  • High heterogeneity across studies
  • Low and variable power

Simulation of heterogeneous, low-powered studies
library(tidyverse)
library(metafor)

set.seed(2)

simulate_studies <- function(
  k = 30,                   # number of studies
  mu = 0.3,                 # true mean effect
  tau = 0.4,                # between-study SD (heterogeneity)
  vi_range = c(0.05, 0.30)  # range of within-study sampling variances
) {
  vi <- runif(k, vi_range[1], vi_range[2])  # sampling variance per study
  theta <- rnorm(k, mu, tau)                # true study-level effects
  yi <- rnorm(k, theta, sqrt(vi))           # observed effect estimates

  tibble(
    study = seq_len(k),
    yi = yi,
    vi = vi,
    se = sqrt(vi),
    z = yi / se,
    p = 2 * pnorm(-abs(z)),                 # two-sided p-value
    direction = if_else(yi > 0, "positive", "negative"),
    significant = p < 0.05
  )
}

dat <- simulate_studies()
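
Before counting anything, it is worth checking how underpowered these studies actually are. A rough sketch, computing two-sided z-test power against the true mean effect (mu = 0.3) while ignoring heterogeneity:

Approximate per-study power at the mean effect
# Rough power of a two-sided z-test to detect mu = 0.3 at alpha = 0.05,
# ignoring between-study heterogeneity; illustrative only.
dat |>
  mutate(power = pnorm(0.3 / se - qnorm(0.975)) +
           pnorm(-0.3 / se - qnorm(0.975))) |>
  summarise(min = min(power), median = median(power), max = max(power))

With these settings, no study has power much above a quarter, so a majority of non-significant results is expected even though the mean effect is real.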

5 What direction-based vote counting concludes

Code
dat |> count(direction)
# A tibble: 2 × 2
  direction     n
  <chr>     <int>
1 negative     12
2 positive     18

A majority of studies (18 of 30) estimate a positive effect.

A common narrative conclusion would be:

“Most studies suggest a positive effect, although results are mixed.”

This sounds cautious and reasonable.
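
That majority is no accident. Because each yi is normal with mean mu and variance tau^2 + vi, the expected share of positive estimates follows directly from the simulation's own parameters:

Expected share of positive estimates
# P(yi > 0) = pnorm(mu / sqrt(tau^2 + vi)), averaged over the studies' vi
mean(pnorm(0.3 / sqrt(0.4^2 + dat$vi)))

This comes out at roughly 70%, consistent with the 18 of 30 observed; a “majority of positive studies” is exactly what a modest, highly variable effect produces.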

6 What significance-based vote counting concludes

Code
dat |> count(significant)
# A tibble: 2 × 2
  significant     n
  <lgl>       <int>
1 FALSE          22
2 TRUE            8

Now we see something very different: only 8 of 30 studies are statistically significant.

A common narrative conclusion here would be:

“Most studies found no significant effect.”
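
The same tension carries over to visual vote counting (Section 2.3). A minimal sketch of the kind of figure that invites mental tallying:

A vote-counting bar chart
# Tallies of direction and significance; the reader is invited to count votes.
dat |>
  count(direction, significant) |>
  ggplot(aes(direction, n, fill = significant)) +
  geom_col() +
  labs(x = "Direction of estimate", y = "Number of studies", fill = "p < 0.05")

Depending on whether readers tally the bars by direction or by fill, the same figure supports either of the two contradictory narratives.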

7 Two conclusions from the same data

These two conclusions are:

  • Based on the same evidence
  • Both commonly used in ecological reviews
  • Logically incompatible

The contradiction arises because vote counting answers the wrong question:

How many studies “found” an effect?

instead of:

What does the evidence imply about the distribution of effects?

8 The meta-analytic reality check

Now we fit a standard random-effects meta-analysis to the same data.

Random-effects meta-analysis
m <- rma(yi, vi, data = dat, method = "REML")
predict(m)

   pred     se   ci.lb  ci.ub   pi.lb  pi.ub 
 0.2334 0.1209 -0.0036 0.4704 -0.8263 1.2931 

The result is:

  • A positive pooled mean (0.23) whose confidence interval only just crosses zero
  • A very wide prediction interval (-0.83 to 1.29)

In other words:

The effect exists, but its direction and magnitude vary strongly by context.

Vote counting cannot express this.
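
One thing the model can express that counts cannot: the implied share of contexts in which the true effect is positive. A rough sketch, treating the estimated mean and heterogeneity as known quantities:

Implied share of contexts with a positive true effect
# Assumes true effects are normal with mean coef(m) and SD sqrt(m$tau2),
# and ignores estimation uncertainty in both; a rough sketch.
pnorm(coef(m) / sqrt(m$tau2))

Here that is roughly two thirds: a real average effect, but one that flips sign in a substantial minority of contexts.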

9 Why vote counting fails under ecological conditions

Vote counting does not merely discard information; it systematically distorts it when power is low and heterogeneity is high.

Mechanistically:

  • Low power means a non-significant result is not evidence of no effect (demonstrated below)
  • Equal votes ignore precision
  • Direction ignores magnitude
  • Significance thresholds create false dichotomies
  • Heterogeneity is collapsed into counts

Under these conditions, vote counting is biased against detecting real but variable effects.
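
As a quick check of the first point, rerun the whole simulation many times. Every evidence base below is generated with a real mean effect, yet the share of significant studies typically stays well below one half:

Share of significant studies across repeated evidence bases
# Each replicate is a fresh evidence base with a real mean effect (mu = 0.3).
sig_share <- map_dbl(1:200, ~ mean(simulate_studies()$significant))
summary(sig_share)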

10 When is vote counting defensible?

There are limited cases where vote counting may be acceptable:

  • Purely scoping exercises
  • Binary outcomes with very similar designs
  • Explicitly descriptive summaries with no inferential claims

Even then, it should never be used to infer:

  • Strength of effects
  • Generality
  • Expected outcomes in new contexts

11 Better alternatives

Depending on the question, better options include:

  • Random-effects meta-analysis with prediction intervals
  • Magnitude-based synthesis
  • Meta-regression focused on context, not averages
  • Decision-analytic framing

These approaches do not eliminate uncertainty; they represent it honestly.
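
For the meta-regression option, here is a minimal metafor sketch. The habitat moderator is purely hypothetical (it does not exist in the simulated data above) and is invented only to show the syntax:

Meta-regression sketch with a hypothetical moderator
# 'habitat' is invented here purely for illustration; in a real review it
# would be an ecologically meaningful covariate extracted from each study.
dat$habitat <- sample(c("forest", "grassland"), nrow(dat), replace = TRUE)
rma(yi, vi, mods = ~ habitat, data = dat, method = "REML")

The point is not this particular model, but that moderators let the synthesis ask how effects vary across contexts rather than whether studies “voted” for an effect.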

12 A zombie idea worth retiring

Vote counting persists not because it works, but because it feels intuitive under uncertainty. In ecology, that intuition is usually misleading.

The uncomfortable truth is this:

When evidence is heterogeneous and underpowered, counting studies is one of the worst ways to summarise it.