Doctors aren't fools. In 1954, when the research was published in the doctors' own professional magazine, the British Medical Journal, they could draw their own conclusions. Hill quit smoking that year, and many of his fellow doctors quit, too. Doctors became the first identifiable social group in the UK to give up smoking in large numbers.
In 1954, then, two visions of statistics had emerged at the same time. To the many readers of Darrell Huff's How to Lie with Statistics, statistics were a game, full of swindlers and cheats—and it could be amusing to catch the scoundrels at their tricks. But for Austin Bradford Hill and Richard Doll, statistics were no laughing matter.
Their game had the highest imaginable stakes, and if it was played honestly and well, it would save lives.
By the spring of 2020—as I was putting the finishing touches to this book—the high stakes involved in rigorous, timely, and honest statistics had suddenly become all too clear. A new coronavirus was sweeping the world. Politicians had to make the most consequential decisions in decades, and fast. Many of those decisions depended on data detective work that epidemiologists, medical statisticians, and economists were scrambling to conduct. Tens of millions of lives were potentially at risk. So were billions of people's livelihoods.
As I write these words, it is early April 2020: countries around the world are a couple of weeks into lockdowns, global deaths have just passed 60,000, and it is far from clear how the story will unfold. Perhaps, by the time this book is in your hands, we will be mired in the deepest economic depression since the 1930s and the death toll will have mushroomed. Perhaps, by human ingenuity or good fortune, such apocalyptic fears will have faded into memory. Many scenarios seem plausible. And that's the problem.
An epidemiologist, John Ioannidis, wrote in mid-March that COVID-19 may be "a once-in-a-century evidence fiasco." The data detectives are doing their best—but they're having to work with data that are patchy, inconsistent, and woefully inadequate for making life-and-death decisions with the confidence we'd like.
Details of the fiasco will, no doubt, be studied for years to come. But some things already seem clear. At the beginning of the crisis, for example, politics seem to have impeded the free flow of honest statistics—a problem we'll return to in the eighth chapter. Taiwan has complained that in late December 2019 it had given important clues about human-to-human transmission to the World Health Organization—but as late as mid-January, the WHO was reassuringly tweeting that China had found no evidence of human-to-human transmission. (Taiwan is not a member of the WHO, because China claims sovereignty over the territory and demands that it should not be treated as an independent state. It's possible that this geopolitical obstacle led to the alleged delay.)
Did this matter? Almost certainly, although with cases doubling every two or three days, we will never know exactly what might have been different with an extra couple of weeks of warning. It's clear that many leaders took their time before appreciating the potential gravity of the threat. President Trump, for instance, announced in late February, "It's going to disappear. One day, it's like a miracle, it will disappear." Four weeks later, with 1,300 Americans dead and more confirmed cases in the United States than in any other country, Mr. Trump was still talking hopefully about getting everybody to church at Easter.
As I write, debates are raging. Can rapid testing, isolation, and contact tracing contain outbreaks indefinitely, or only delay their spread? Should we worry more about small indoor gatherings or large outdoor gatherings? Does closing schools help prevent the spread of the virus, or do more harm as children go to stay with vulnerable grandparents? How much does wearing masks help? These and many other questions can be answered only by good data on who has been infected, and when.
But a vast number of infections were not being registered in official statistics, due to a lack of tests—and the tests that were being conducted were giving a skewed picture, being focused on medical staff, critically ill patients, and—let's face it—rich, famous people. At the time of writing, the data simply can't yet tell us how many mild or asymptomatic cases there are—and hence how deadly the virus really is. As the death toll rose exponentially in March—doubling every two days—there was no time to wait and see. Leaders put economies into an induced coma—more than three million Americans filed jobless claims in a single week in late March, five times the previous record. The following week was even worse: another six and a half million claims were filed. Were the potential health consequences really catastrophic enough to justify sweeping away so many people's incomes? It seemed so—but epidemiologists could only make their best guesses with very limited information.
It's hard to imagine a more extraordinary illustration of how much we usually take accurate, systematically gathered numbers for granted. The statistics for a huge range of important issues that predate the coronavirus have been painstakingly assembled over the years by diligent statisticians, and often made available to download, free of charge, anywhere in the world. Yet we are spoiled by such luxury, casually dismissing "lies, damned lies, and statistics." The case of COVID-19 reminds us how desperate the situation can become when the statistics simply aren't there.