By: Ravi Ranjan

Every morning I get up, make coffee, log in to my computer, and pull up the daily news digest of the New York Times. This routine now includes a consistent COVID-19 update of total new cases, the daily death count, and glum future projections of how much worse it could get.

Often, specific mathematical models are cited as the source for these numbers. But one might wonder what these mathematical models are and what they do. After all, how do we know that there will be between 207,000 and 218,000 deaths due to COVID-19 by October 10, as forecast by the 40 models that the Centers for Disease Control and Prevention (CDC) uses? In March, the Institute for Health Metrics and Evaluation (IHME) at the University of Washington predicted 162,000 deaths from coronavirus by August 4th; the death toll ended up being 156,000 on that date, showing this particular model was reasonably accurate.

Was this just wishful or extremist thinking? Or is there real evidence backing these predictions? Why is math even in the picture at all when tallying infections and death counts? After all, diseases are typically associated with doctors, hospitals, and drugs. So why is it that during a pandemic, we are seeing widespread use of math?

Contrary to popular perceptions, biology and math are inextricably linked. This is because biological systems are complex, and math offers us a way to articulate and analyze that complexity. On the surface, COVID-19 seems like a simple epidemic to control: just stop the pathogen (SARS-CoV-2) from spreading. Because the virus spreads by contact, eliminating social contact should contain it.

However, it is not that simple. Biological systems contain interacting parts that become harder to describe and predict as the complexity of the system increases. For example, the probability of getting infected with SARS-CoV-2 depends on an individual’s immune system and the duration of contact with an infected person (the longer the contact, the higher the viral load, and the more likely one is to get infected). Once a person is infected, the virus can affect them in very different ways, on a sliding scale from asymptomatic (no symptoms) to severely debilitating illness (on a ventilator). Furthermore, some infected individuals are suspected to be super-spreaders of the virus, while others may not spread the virus as much.

We wrap our heads around these complex and highly variable narratives using verbal reasoning in the form of if-then statements. Such logical operations begin to imply the use of probability: the chance of any given outcome in relation to a known scenario. Therefore, mathematical models are essential in evaluating the potential outcomes of all of these nuanced situations.

Mathematical models quantify these processes and their resulting interactions by stripping away unnecessary detail from a problem and focusing on the aspects of interest. It is crucial to understand that mathematical models are models; therefore, by definition, they are idealized caricatures of reality. Perhaps the most useful example in this context is a map. A map is a model of the world that hides extraneous details (like the location of every tree) and focuses on the useful aspects (such as roads and gas stations). Depending on the purpose of the map, the scale (within a city vs. across the country) and the focus (topography vs. roads) can be completely different. While the most realistic map of the world is the real world itself (and everything within it), that map would be fairly useless if you were trying to get specific directions from East Lansing to Kalamazoo.

In the context of epidemics, the simplest models, known as Susceptible-Infected-Resistant (SIR) models, assume that the population is infinitely large, is spread uniformly across space, and that all individuals are the same except for the presence or absence of infection. The models then divide the population into three categories: susceptible, infected, and resistant (recovered) individuals.

To understand how disease spreads through a population, these models then focus on these categories and track how they change over time under different conditions. This is, of course, far removed from reality, where population sizes are finite, where epidemics are often clustered in specific communities, and where individuals vary in age, sex, and other health conditions. However, while there are other models that incorporate these additional factors, it is important to realize that the SIR model is quite deliberately simplistic.
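As a rough illustration of how such a model tracks the three categories over time, here is a minimal Python sketch of the textbook SIR equations, stepped forward with simple Euler integration. The function name `simulate_sir`, the transmission rate `beta`, the recovery rate `gamma`, and all of the numbers are assumptions chosen for illustration; they are not fitted to COVID-19 or taken from any of the forecasts mentioned here.

```python
def simulate_sir(beta, gamma, s0, i0, r0=0.0, days=160, dt=0.1):
    """Step the textbook SIR equations forward with a forward-Euler scheme.

    s, i and r are fractions of the population (they sum to 1), which matches
    the idealization of a very large, well-mixed population.
    beta  : hypothetical transmission rate per day
    gamma : hypothetical recovery rate per day (1 / infectious period)
    """
    s, i, r = s0, i0, r0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt  # susceptible -> infected
        new_recoveries = gamma * i * dt     # infected -> resistant
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Illustrative values only: each infected person infects about
# beta / gamma = 3 others when nearly everyone is still susceptible.
history = simulate_sir(beta=0.3, gamma=0.1, s0=0.999, i0=0.001)

# Find the moment when the infected fraction is largest.
peak_time, _, peak_i, _ = max(history, key=lambda row: row[2])
never_infected = history[-1][1]

print(f"Infections peak around day {peak_time:.0f}, "
      f"with {peak_i:.0%} of the population infected at once")
print(f"Fraction never infected by day 160: {never_infected:.0%}")
```

Changing `beta` (for instance, lowering it to mimic reduced contact) and rerunning shows how the size and timing of the peak shift, which is the kind of general insight this class of model is built for.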

The simplicity of the SIR model allows it to yield insights that are general in nature and apply to any disease that results in susceptible, infected, and resistant individuals. Thus, models like SIR are used to gain understanding about the spread of diseases and are not meant to be predictive. Predictive models are more realistic and include details specific to the disease at hand. Unfortunately, as seen in the early days of COVID-19, simplistic models are often misused for predictive purposes. This problem is frustratingly exacerbated when poor and inaccurate models are highlighted by the Trump administration and the media. For example, the White House Council of Economic Advisers used a particularly bad graphic to seemingly forecast that there would be no more COVID-19 deaths by mid-May, a grossly inaccurate assessment using the projections of the IHME (the institute behind the fairly accurate prediction for August 4th mentioned earlier).

While the charge was later denied by the White House, this remains a pertinent example of bad science combined with poor communication. So, what can non-experts do in a situation like this?

1. Trust the experts and listen to their advice; this goes a long way. Experts are typically trained in developing and analyzing these models, and offer only cautious advice.

2. Focus on the assumptions of the cited models. Particularly for models being used for predictions, it is important that they capture significant aspects of the problem at hand. If a model does not carefully outline its assumptions, or if it makes assumptions that are wildly inaccurate, there is cause for concern.

3. Predictions have errors built into them. Look for them when analyzing forecasts. Treat startling predictions with caution, and demand stronger evidence for bolder claims.

4. Finally, get your science news from reliable sources that follow sound journalistic practices, such as fact-checking and obtaining second opinions from independent scientists.

If not criticized and caught early, bad models can drive dangerous policy decisions. However, when analyzed with caution and rigor, mathematical models become excellent tools for gaining insights into complex systems and can be used to guide policy development.

RAVI RANJAN set out to be an engineer, but got lost on the way and is now trying to use math to solve problems in ecology and evolution. When not trying to finish his PhD, he is often found trying to convince people that huskies are royalty and should be treated as such.