We suggest a statistical test for underdispersion in the reported Covid-19 case and death numbers, compared to the variance expected under the Poisson distribution. Screening all countries in the World Health Organization (WHO) dataset for evidence of underdispersion yields 21 countries with statistically significant underdispersion. Most of the countries in this list are known, based on the excess mortality data, to strongly undercount Covid deaths. We argue that Poisson underdispersion provides a simple and useful test to detect reporting anomalies and highlight unreliable data.
Irregular statistical variation has proven a powerful forensic tool for detecting possible fraud in academic research, accounting statements and election tallies. Now similar techniques are helping to find a new subgenre of faked numbers: covid-19 death tolls.
That is the conclusion of a new study to be published in Significance, a statistics magazine, by the researcher Dmitry Kobak. Mr Kobak has a penchant for such studies—he previously demonstrated fraud in Russian elections based on anomalous tallies from polling stations. His latest study examines how reported death tolls vary over time. He finds that this variance is suspiciously low in a clutch of countries—almost exclusively those without a functioning democracy or a free press.
Mr Kobak uses a test based on the “Poisson distribution”. This is named after Siméon Denis Poisson, a French mathematician who first noticed that when modelling certain kinds of counts, such as the number of people who enter a railway station in an hour, the distribution takes on a specific shape with one mathematically pleasing property: the mean of the distribution is equal to its variance.
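The mean-equals-variance property can be checked numerically. The sketch below is only an illustration: the rate of 50 events per period and the cut-off of 400 are arbitrary assumptions, and the helper name `poisson_pmf` is hypothetical. It builds the Poisson probabilities directly and computes both moments from them.

```python
import math

def poisson_pmf(lam, k_max):
    """Return [P(X=0), ..., P(X=k_max)] for X ~ Poisson(lam)."""
    probs = [math.exp(-lam)]
    for k in range(1, k_max + 1):
        # Recurrence: P(X=k) = P(X=k-1) * lam / k
        probs.append(probs[-1] * lam / k)
    return probs

lam = 50.0  # assumed rate, chosen only for illustration
probs = poisson_pmf(lam, k_max=400)  # tail beyond 400 is negligible here

mean = sum(k * p for k, p in enumerate(probs))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))

print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

Both values come out equal to the rate, up to floating-point error, whatever rate is chosen.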
This idea can be useful in modelling the number of covid deaths, but requires one extension. Unlike a typical Poisson process, the number of people who die of covid can be correlated from one day to the next—superspreader events, for example, lead to spikes in deaths. As a result, the distribution of deaths should be what statisticians call “overdispersed”—the variance should be greater than the mean. Jonas Schöley, a demographer not involved with Mr Kobak’s research, says he has never in his career encountered death tallies that would fail this test.
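One simple way to express this idea is a dispersion index, the sample variance divided by the sample mean: values near 1 are consistent with a Poisson process, values well above 1 indicate overdispersion, and values well below 1 indicate underdispersion. The sketch below is not the procedure used in the study; the function name `dispersion_index` is hypothetical and both series are invented purely for illustration.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson counts)."""
    return variance(counts) / mean(counts)

# Invented daily counts, NOT real data: both series average 100 per day.
steady = [100, 101, 99, 100, 102, 98, 100]   # suspiciously smooth
bursty = [60, 150, 80, 200, 40, 120, 50]     # spikes, as after superspreading

print(f"steady: {dispersion_index(steady):.3f}")  # far below 1: underdispersed
print(f"bursty: {dispersion_index(bursty):.3f}")  # far above 1: overdispersed
```

With only seven observations the index is noisy, so a real screening would need a formal significance test rather than a bare comparison with 1.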
The Russian numbers offer an example of abnormal neatness. In August 2021 daily death tallies went no lower than 746 and no higher than 799. Russia’s invariant numbers continued into the first week of September, ranging from 792 to 799. A back-of-the-envelope calculation shows that such a low-variation week would occur by chance once every 2,747 years.
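A rough calculation of this kind could be set up as follows. This is a sketch under stated assumptions, not a reconstruction of the figure quoted above: it assumes daily deaths are independent Poisson draws, takes the midpoint of the reported range as the mean, and treats weeks as independent trials. Different assumptions give very different answers, so only the order of magnitude is meaningful. The function name `poisson_range_prob` is hypothetical.

```python
import math

def poisson_range_prob(lam, lo, hi):
    """P(lo <= X <= hi) for X ~ Poisson(lam), computed in log space."""
    total = 0.0
    for k in range(lo, hi + 1):
        # log P(X=k) = -lam + k*log(lam) - log(k!)
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        total += math.exp(log_p)
    return total

lo, hi = 792, 799        # range reported for the first week of September
lam = (lo + hi) / 2      # ASSUMPTION: true daily mean is the midpoint

p_day = poisson_range_prob(lam, lo, hi)   # one day lands inside the range
p_week = p_day ** 7                       # all seven days do (independence assumed)
weeks_per_year = 365.25 / 7
years_between = 1 / (p_week * weeks_per_year)

print(f"P(one day in range)  = {p_day:.4f}")
print(f"P(whole week in range) = {p_week:.3e}")
print(f"roughly one such week every {years_between:,.0f} years")
```

Even a pure Poisson process, which ignores the extra day-to-day variation that real death counts show, makes such a week extraordinarily unlikely.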