The lead article in yesterday’s Post described some interesting analysis of one of Thomas Jefferson’s early drafts of the Declaration of Independence. Although several words can be seen to have been crossed out and replaced with others, in one instance where Jefferson initially used the word subjects, he did his best to not just cross it out but actually erase it and write citizens in its place. (I assume the part in question is the following final version: “He has constrained our fellow Citizens taken Captive on the high Seas to bear Arms against their Country…”)
In the spirit of this holiday weekend, I can’t resist mentioning another interesting application of mathematics, in this case to the problem of determining authorship of the so-called “disputed” Federalist Papers. This isn’t new stuff; however, I like problems (and solutions) like this because the ideas can be explained relatively simply… while at the same time there is some meaty mathematics under the surface. It is the kind of problem that has the potential to excite and challenge students.
The Federalist Papers are a collection of essays written by (variously) Alexander Hamilton, James Madison, and a few by John Jay, in support of ratification of the U.S. Constitution. Although authorship of most of these essays is relatively certain, there has been some debate about twelve in particular. Today, all twelve of these “disputed papers” are generally thought to have been written by Madison.
My first exposure to this problem was a 1998 paper by Bosch and Smith. (Unfortunately, this JSTOR link is not accessible without a journal subscription.) In it, the authors describe the idea of using “separating hyperplanes” to identify the author(s) of the disputed papers. They compute, for each of the essays, a point in 70-dimensional space, with each coordinate indicating the frequency of occurrence of a corresponding “function word.” Think of these function words and their frequencies of use as a “fingerprint” that is unique to a particular author.
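To make the “fingerprint” idea concrete, here is a minimal sketch of how one essay might be turned into a point. The short word list, the function name, and the per-1000-words scaling are all my own assumptions for illustration; the paper uses its own list of 70 function words.

```python
import re
from collections import Counter

# Hypothetical, much-shortened list of function words (the paper uses 70).
FUNCTION_WORDS = ["upon", "to", "the", "by", "would"]

def function_word_frequencies(text, words=FUNCTION_WORDS, per=1000):
    """Return one coordinate per function word: occurrences per `per` words."""
    # Lowercase the text and keep only runs of letters as tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        return [0.0] * len(words)
    counts = Counter(tokens)
    # Counter returns 0 for words that never appear.
    return [per * counts[w] / total for w in words]
```

Calling this on the full text of each essay gives one point per essay, with as many coordinates as there are function words in the list.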
Now, considering only the 65 points corresponding to the undisputed Hamilton and Madison papers, the authors compute a “separating hyperplane,” that is, a hyperplane such that all of the points corresponding to Hamilton’s essays are on one side, and Madison’s on the other. (This is where the interesting mathematics comes in; how do you compute such a separating hyperplane? Under what conditions does a separating hyperplane even exist? In the likely case that there is an entire infinite family of possible separating hyperplanes, how much does it matter which one you choose?)
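As one way to experiment with the first of those questions, here is a minimal sketch that searches for a separating hyperplane using the classic perceptron update rule. To be clear, this is a technique I am swapping in for illustration, not necessarily the method used in the paper, and the function name and the `max_epochs` cutoff are my own assumptions.

```python
def find_separating_hyperplane(points_a, points_b, max_epochs=1000):
    """Search for (w, b) with w.x + b > 0 on points_a and < 0 on points_b.

    Returns None if no such hyperplane is found within max_epochs passes
    (which does not by itself prove that none exists).
    """
    dim = len(points_a[0])
    w = [0.0] * dim
    b = 0.0
    # Label one author's points +1 and the other's -1.
    labeled = [(p, 1) for p in points_a] + [(p, -1) for p in points_b]
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in labeled:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:
                # Point is on the wrong side (or on the hyperplane):
                # nudge the hyperplane toward classifying it correctly.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            return w, b
    return None
```

The perceptron stops at the first hyperplane that happens to work, which is exactly why the last question above matters: a different starting point or ordering of the essays can produce a different member of the family.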
Anyway, given such a hyperplane separating the two authors of the undisputed papers, the authorship of the disputed papers may be determined by observing on which side of the hyperplane the corresponding points fall. It turns out that this approach yields the same conclusion as the prevailing view: all twelve were written by Madison.
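The “which side” test is just the sign of a dot product. Here is a minimal sketch, assuming a hyperplane given by a normal vector `w` and offset `b`, with Hamilton’s essays on the positive side; the function name, the choice of sides, and the numbers in the usage example are made up for illustration and are not taken from the paper.

```python
def attribute(point, w, b, positive="Hamilton", negative="Madison"):
    """Name the author whose side of the hyperplane w.x + b = 0 the point is on."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    if score > 0:
        return positive
    if score < 0:
        return negative
    # The point lies exactly on the hyperplane, so it gives no verdict.
    return None

# Hypothetical two-dimensional hyperplane and points, for illustration only.
w, b = [2.0, 1.0], 1.0
print(attribute([-3.0, 0.5], w, b))
print(attribute([1.0, 1.0], w, b))
```

With these made-up numbers, the first point scores -4.5 and the second scores 4.0, so they land on opposite sides.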