## Ambiguous notation for logarithms

The motivation for this post is to respond to some questions about a recent video presentation titled “Why you haven’t caught Covid-19 [sic],” presented by Anne Marie Knott, a professor in the Olin Business School at Washington University in St. Louis. The gist of the presentation is an argument against the “non-pharmaceutical interventions,” such as stay-at-home orders, in response to the current pandemic.

I am not interested in arguing about government policies, or even epidemiological models here. Frankly, this video is too easy a target. The error made in this video is a mathematical one: an error so simple, and yet so critical to the presenter’s argument, that it’s not worth bothering with the remainder of the presentation. Instead, I’d like to use this video as an excuse to rant about mathematical notation.

The problem starts at about 3:38 in the video, where the presenter attempts to analyze the COVID-19 outbreak on the aircraft carrier USS Theodore Roosevelt as a realization of the so-called “final size equation,” a model of the end-game, steady-state extent of an epidemic in a closed system (since the sailors were isolated onboard the ship for a significant period of time). The final size equation is $p = 1-e^{-R_0 p}$

where $p$ is the “final size” of the pandemic, or the fraction of the population that is eventually infected, and $R_0$ is the basic reproduction number, essentially the average number of additional people infected through contact with a person already infected, in the situation where everyone in the population is initially susceptible to infection.

As the presenter explains, there is a critical difference between a reproduction number less than one, resulting in “extinction” of the disease, and a value greater than one, resulting in an epidemic. Using the fact that 856 of the 4954 sailors onboard the Roosevelt eventually tested positive for COVID-19, corresponding to $p=856/4954$, we can estimate $R_0$ by solving for it in the final size equation, yielding $R_0 = -\frac{\ln (1-p)}{p}$

It’s a simple exercise to verify that the resulting estimate of $R_0$ is about 1.1. It’s also a relatively simple exercise to verify that this estimation technique cannot possibly yield an estimate of $R_0$ that is less than one.
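One way to see the second claim is sketched below. It uses the standard Taylor series for the logarithm, which is my choice of argument and not something from the video. For any final size $0 < p < 1$,

$$-\ln(1-p) = p + \frac{p^2}{2} + \frac{p^3}{3} + \cdots$$

so dividing by $p$ gives

$$R_0 = -\frac{\ln(1-p)}{p} = 1 + \frac{p}{2} + \frac{p^2}{3} + \cdots > 1$$

Every term after the leading 1 is positive, so the estimate always exceeds one. With $p = 856/4954 \approx 0.173$, the first few terms already give roughly $1 + 0.086 + 0.010 + 0.001 \approx 1.1$.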

Despite this, the presenter manages (conveniently for her argument that the contagiousness of the virus is overblown) to compute a value of $R_0$ of about 0.48… by computing the base 10 logarithm $\log_{10}(1-p)$ instead of the natural logarithm $\ln(1-p)=\log_e(1-p)$ in the formula above.
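The two numbers are easy to reproduce. Here is a minimal Python sketch, using the case counts quoted above; the function names `r0_natural_log` and `r0_base10_log` are mine, purely for illustration.

```python
import math

def r0_natural_log(p):
    """Estimate R0 from the final size equation, using the natural log."""
    return -math.log(1 - p) / p

def r0_base10_log(p):
    """The same formula with the base 10 log swapped in by mistake."""
    return -math.log10(1 - p) / p

# Figures quoted in the post: 856 positive tests among 4954 sailors.
p = 856 / 4954

correct = r0_natural_log(p)
mistaken = r0_base10_log(p)

print(f"p                   = {p:.4f}")
print(f"R0 (natural log)    = {correct:.2f}")   # about 1.10
print(f"R0 (base 10 log)    = {mistaken:.2f}")  # about 0.48
print(f"ratio of the two    = {correct / mistaken:.4f}")
```

The two estimates differ by a constant factor of $\ln 10 \approx 2.3$, whatever the value of $p$. That factor is enough to carry an estimate just above one to well below it.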

It’s interesting to try to guess how the presenter managed to make this mistake. My guess is that she did this in an Excel spreadsheet; that is the only environment I know of where log(x) computes the base 10 logarithm. In any other programming environment I can think of, log(x) is the natural logarithm, and you have to work at it, so to speak, via log10(x), or log(x)/log(10), to compute the base 10 logarithm.
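For example, here is how this looks in Python, as a small sketch using only the standard `math` module; the variable names are just for illustration.

```python
import math

x = 1000.0

# One-argument log() is the natural logarithm.
natural = math.log(x)

# Base 10 has to be requested explicitly, in one of several ways.
base10 = math.log10(x)
base10_via_ratio = math.log(x) / math.log(10)
base10_two_arg = math.log(x, 10)

# Base 2 has its own function as well.
base2 = math.log2(x)

print("log(x)          =", natural)
print("log10(x)        =", base10)
print("log(x)/log(10)  =", base10_via_ratio)
print("log(x, 10)      =", base10_two_arg)
print("log2(x)         =", base2)
```

The change-of-base versions can differ from `log10(x)` in the last bit or so because of floating-point rounding, so it's safer to compare them with a tolerance than with `==`.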

The mathematical notation situation is a bit of a mess as well. Sometimes I’m a mathematician, where $\ln x$ means the natural logarithm, and any other base is usually specified explicitly as $\log_b x$. But sometimes I am an engineer, where $\log x$ usually means base 10, but sometimes in a communications context it might mean base 2. Other times I am a computer scientist, where $\lg x$ is a common shorthand for base 2, and $\log x$ can mean pretty much anything, including “I don’t care about the base.”


### 1 Response to Ambiguous notation for logarithms

1. Wilmington says:

If you consider a TI-30X IIS calculator to be an “environment”, then “LOG” on it computes the base 10 logarithm. But what’s that button marked “LN” right below it? 🙂

Same on the TI-30XS and on the nicest calculator I ever owned, the HP 41CV.
On a TI BA II Plus (financial calculator), however, she couldn’t make this mistake because it has only the natural log.
