Math in the age of COVID-19: the what and the why of mathematical modeling.

Updated: Sep 23, 2020

By: Ravi Ranjan

Every morning I get up, make coffee, log in to my computer, and pull up the daily news digest of the New York Times. This routine now includes a consistent COVID-19 update of total new cases, the daily death count, and glum future projections of how much worse it could get.

World map showing COVID cases.
World Health Organization data as of September 20th, 2020

Often, specific mathematical models are cited as the source for these numbers. But what are these mathematical models, and what do they do? After all, how do we know that there will be between 207,000 and 218,000 deaths due to COVID-19 by October 10, as forecast by the 40 models that the Centers for Disease Control and Prevention (CDC) uses? In March, the Institute for Health Metrics and Evaluation at the University of Washington predicted that there would be 162,000 deaths from coronavirus by August 4th; the death toll on that date turned out to be 156,000, a difference of about 4%, showing that this particular model was reasonably accurate.

Was this just wishful or extremist thinking? Or is there real evidence backing these predictions? Why is math even in the picture at all when tallying infections and death counts? After all, diseases are typically associated with doctors, hospitals, and drugs. So why is it that during a pandemic, we are seeing widespread use of math?

Line graphs predicting 5,000 weekly deaths from COVID over the next month
CDC’s national predictions for COVID-19 deaths as of September 14th, 2020

Contrary to popular perception, biology and math are inextricably linked. This is because biological systems are complex, and math offers us a way to articulate and analyze that complexity. On the surface, COVID-19 seems like a simple epidemic to control: just stop the pathogen (SARS-CoV-2) from spreading. Since the virus spreads by contact, eliminating social contact should contain its spread.

However, it is not that simple. Biological systems contain interacting parts that become harder to describe and predict as the complexity of the system increases. For example, the probability of getting infected with SARS-CoV-2 depends on an individual’s immune system and the duration of contact with an infected person (the longer the contact, the higher the viral load, and the more likely one is to get infected). Once a person is infected, the virus can affect them in very different ways, on a sliding scale from asymptomatic (no symptoms) to severely debilitating illness (requiring a ventilator). Furthermore, some infected individuals are suspected to be super-spreaders of the virus, while others may not spread the virus as much.

We wrap our heads around these complex and highly variable narratives using verbal reasoning in the form of if-then statements. Such logical operations begin to imply the use of probability: the chance of any given outcome under a known scenario. Mathematical models are therefore essential for evaluating the potential outcomes of all of these nuanced situations.

Mathematical models quantify these processes and their resulting interactions by stripping away unnecessary detail from a problem and focusing on the aspects of interest. It is crucial to understand that mathematical models are models; by definition, they are idealized caricatures of reality. Perhaps the most useful example in this context is a map. A map is a model of the world that hides extraneous details (like the location of every tree) and focuses on the useful aspects (such as roads and gas stations). Depending on the purpose of the map, its scale (within a city vs. across the country) and its focus (topography vs. roads) can be completely different. While the most realistic map of the world is the real world itself (and everything within it), that map would be fairly useless if you were trying to get specific directions from East Lansing to Kalamazoo.

In the context of epidemics, the simplest models, known as Susceptible-Infected-Recovered (SIR) models, assume that the population is infinitely large, is spread uniformly across space, and consists of individuals who are identical except for the presence or absence of infection. The models then divide the population into three categories: susceptible, infected, and recovered individuals.

To understand how disease spreads through a population, these models track the three categories and see how they change over time under different conditions. This is, of course, far removed from reality, where population sizes are finite, where epidemics are often clustered in specific communities, and where individuals vary in age, sex, and other health conditions. However, while there are other models that incorporate these additional factors, it is important to realize that the SIR model is quite deliberately simplistic.
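To make the idea of tracking the three categories concrete, here is a minimal sketch of the textbook SIR equations in Python. It is an illustration only, not one of the forecasting models mentioned above. The transmission rate `beta`, the recovery rate `gamma`, the starting infected fraction `i0`, and all numerical values are assumptions chosen for demonstration and are not fitted to COVID-19 data. Because the model treats the population as infinitely large, the code works with fractions of the population rather than head counts.

```python
def simulate_sir(beta, gamma, i0=0.001, days=160, dt=0.1):
    """Integrate the classic SIR equations with small forward-Euler steps.

    ds/dt = -beta * s * i
    di/dt =  beta * s * i - gamma * i
    dr/dt =  gamma * i

    s, i, r are the susceptible, infected, and recovered fractions of the
    population (they always sum to 1). beta and gamma are illustrative
    per-day rates, not estimates for any real disease.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt   # susceptible -> infected
        new_recoveries = gamma * i * dt      # infected -> recovered
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Compare two hypothetical scenarios: halving beta stands in for
    # reduced contact between people.
    for beta in (0.3, 0.15):
        run = simulate_sir(beta=beta, gamma=0.1)
        peak_day, _, peak_infected, _ = max(run, key=lambda row: row[2])
        print(f"beta={beta}: peak infected fraction {peak_infected:.3f} "
              f"around day {peak_day:.0f}")
```

Running the script compares two hypothetical scenarios. Lowering `beta`, a stand-in for reduced contact, produces a smaller peak in the infected fraction, which is the kind of general insight a deliberately simple model can offer.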

The simplicity of the SIR model allows it to glean insights which are general in nature and would apply to any disease which results in