Looking Up

Can learning and using another language change or even improve the way you think?  This question first came up after I read a fascinating article last week about experiments in cognitive psychology involving natural languages.  My goal is to connect those ideas with computer science and programming languages.  In both cases, there is evidence suggesting that the answer to the question above is yes.

First, I recommend reading “How Language Shapes Thought,” by Lera Boroditsky, an assistant professor of cognitive psychology at Stanford.  The article is less than three pages, and contains descriptions of several experiments that suggest the extent to which a person’s language affects their ability to perceive the world, perform tasks, etc.

(I find observations from experiments like these to be very interesting… and almost spooky.  We humans have grown used to the idea that our bodies are made up of physical processes, that we are at times like cars with a knock, and medical doctors are like tinkering mechanics.  But many people grow much more uncomfortable at even the suggestion that their brains, their minds, might similarly be “merely” physics, nothing more than a deep pocket of low entropy.)

For example, consider the Kuuk Thaayorre language spoken in northern Australia, which does not contain words for relative directions like “left” or “right.”  Communicating relative spatial relationships in this language always involves absolute cardinal directions like north, east, etc.  The result of this “restriction” is interesting, as Boroditsky writes:

“… People who speak languages that rely on absolute directions are remarkably good at keeping track of where they are, even in unfamiliar landscapes or inside unfamiliar buildings. They do this better than folks who live in the same environments but do not speak such languages and in fact better than scientists thought humans ever could. The requirements of their languages enforce and train this cognitive prowess.”

The idea is pretty simple: our language is not separate from our thought process.  Certainly how we think affects how we create, modify, and evolve our means of communication; but the converse appears also to be true.

Now let us switch gears, and move from cognitive psychology to computer science.  It occurred to me that one of Paul Graham’s essays, Beating the Averages, describes a very similar phenomenon involving programming languages.  (If you have not read any of Graham’s stuff, explore beyond just the link above.  He has written a lot, and almost all of it is interesting reading.)

As with natural language, everyone has a favorite or most commonly used programming language.  That favorite language is almost certainly not machine language (at least I hope not).  And everyone can probably explain why his or her favorite language is “better” than machine language: because machine language lacks some particular nice feature, like vectorized arithmetic operations, or list comprehensions, or macros, or whatever.

This is because machine language is at the bottom of what Graham describes as a “continuum” of expressive power.  We know how to appreciate our own language’s advantages over machine language because we know how to talk about what machine language lacks.  But– and this is the key point– what happens if we try to compare our favorite language with another language that is “higher” on the continuum of expressive power?  As Graham puts it (replacing “Blub” with the programming language of your choice):

“As long as our hypothetical Blub programmer is looking down the power continuum, he knows he’s looking down. Languages less powerful than Blub are obviously less powerful, because they’re missing some feature he’s used to. But when our hypothetical Blub programmer looks in the other direction, up the power continuum, he doesn’t realize he’s looking up. What he sees are merely weird languages. He probably considers them about equivalent in power to Blub, but with all this other hairy stuff thrown in as well. Blub is good enough for him, because he thinks in Blub.”

To wrap this up, my goal was simply to point out some interesting experiments in cognitive psychology, and to compare observations from those experiments with similar observations in computer science.  But I think those observations suggest a clear potential advantage of learning a new language.  Whether it is Spanish or Scheme, Portuguese or Python, we not only learn something new, but might even improve– or perhaps abandon– how we use our “native” language.

It Does, In Fact, Take Two to Tango

This week I want to discuss two interesting subjects: sex and graph theory.  At the same time.

Consider these rather intentionally vague questions as motivation: do men tend to have more sexual partners than women?  Or the other way around?  In either case, how can we appropriately measure any differences?

It appears I am coming to the subject late, but as we will see, some important observations still seem to have been missed.  My interest was first piqued by the following minor note hidden in the most recent issue of the College Mathematics Journal:

“Megan McArdle attempts to illustrate her point that survey respondents are unreliable (“Misleading Indicator,” November Atlantic) by telling us that it is mathematically impossible for men to report an average number of female sexual partners that is much higher than the average number of male partners reported by women.  I agree that survey respondents are unreliable, but so is McArdle’s math.  It may be unlikely that the number of partners reported by honest males would be higher, but it is not mathematically impossible.” – Fred Graf, Concord, NH

This seems to be a continuation of a debate that began at least several years ago, stemming from a couple of studies with some interesting observations.  The following is from a 2007 New York Times article about the studies:

“One [CDC] survey, recently reported by the federal government, concluded that men had a median of seven female sex partners. Women had a median of four male sex partners. Another study, by British researchers, stated that men had 12.7 heterosexual partners in their lifetimes and women had 6.5.”

There are two questions worth asking here: (1) do either of these results make sense, and (2) what conclusions may be drawn from them?

Much of the past discussion has focused on the following argument that the results do not make sense: suppose that there are m men and w women.  Model the heterosexual partnerships between them with a bipartite graph, with a vertex for each of the m men in one partite set, a vertex for each of the w women in the other partite set, and an edge between vertices representing the two corresponding people having had sex with each other in their lifetimes.  Let t be the total number of edges in this graph (i.e., the total number of sexual partnerships).  Then the average number of female partners per male is t/m, and the average number of male partners per female is t/w.

If m = w, that is, if there are equal numbers of men and women, then these averages are the same.  More importantly, even if m \neq w, the averages differ only by the ratio of the two population sizes, not by any measure of the promiscuity of one group or the other.
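To make the argument concrete, here is a minimal sketch with hypothetical four-person populations: the two means are forced to be equal (both are t/m = t/w, since m = w), but nothing forces the medians to agree.

```python
from statistics import mean, median

men = ['m1', 'm2', 'm3', 'm4']
women = ['w1', 'w2', 'w3', 'w4']
# Every man reports exactly one partner -- and it is the same woman.
edges = [(m, 'w1') for m in men]
t = len(edges)  # total number of partnerships (edges in the graph)

# Degree of each vertex = number of partners reported.
deg_men = [sum(1 for a, b in edges if a == m) for m in men]
deg_women = [sum(1 for a, b in edges if b == w) for w in women]

print(mean(deg_men), mean(deg_women))      # both equal t/m = t/w = 1
print(median(deg_men), median(deg_women))  # 1.0 vs. 0.0
```

Equal means are a graph-theoretic necessity here, but a gap in medians is entirely possible, which is relevant to the CDC’s 7-vs.-4 figures discussed below.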

Given this, how can we explain the nearly 2:1 ratio suggested by the British study?  And what can be said about the difference of medians (7 for men, 4 for women) indicated by the CDC study?  (There was some unfortunate confusion about the distinction between mean and median that generated more heat than light; you can read more at this apparently now-abandoned blog.)

Regarding the British study, there are really only three possibilities:

  1. There are nearly twice as many women as men in the surveyed population.  I am reasonably confident that this is not the case.
  2. Survey respondents reported partnerships with persons outside the population being sampled.  Prostitution (men reporting sex with prostitutes that were not themselves surveyed) and survivor bias (reporting partners who are now dead and thus were obviously not surveyed) are two specific explanations along these lines.  But I think neither is sufficient to explain the extent of discrepancy.  For prostitution to be the primary cause of the difference, for example, every man in the survey would have to report an average of about 6 encounters with prostitutes!  (Or half of the men would need to report about 12 encounters, etc.)
  3. Respondents were simply dishonest.  I think this is by far the most plausible explanation, as suggested by McArdle, as well as by David Gale, the mathematician at Berkeley who contributed to the debate.  Men tend to inflate their sexual histories, and women tend to minimize theirs.
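A quick back-of-the-envelope check of possibility 2, using the British study’s figures quoted above (assuming equal numbers of men and women surveyed):

```python
men_mean, women_mean = 12.7, 6.5  # British study, per the NYT quote
gap = men_mean - women_mean       # unexplained partners per man, if m = w

# If a fraction f of the men accounted for the whole gap via partners
# outside the sample (e.g., prostitutes), each would need gap / f of them.
for f in (1.0, 0.5):
    print(f, round(gap / f, 1))
```

This reproduces the figures in the list above: all of the men averaging about 6 out-of-sample encounters, or half of them averaging about 12.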

Finally, regarding the CDC survey, reporting the median vs. the mean does have the potential to provide more insight into what is happening… but I hesitate to trust these results at all, for two reasons.  I wonder how many people bothered to actually read the paper in question, available here.  In the main text of the paper, the medians of 7 and 4 for men and women, respectively, are indeed mentioned.  However, close inspection of the more detailed tables at the end of the paper reveals some interesting things.

First, the median numbers of partners are given as… 6.8 and 3.7, respectively.  Did the authors round these values to 7 and 4?  More importantly, I admit to being confused about how the median of a set of integers can be anything other than another integer or an integer plus 0.5.

Also, a footnote below the tables indicates that “median values exclude [respondents] with no… sexual partners.”  It is not clear to me why this was done, particularly considering that one of the categories into which responses were grouped was “0-1 partners.”

In summary, I think it is certainly interesting to consider how men’s and women’s sexual experiences differ, “in distribution,” so to speak; how men and women view their sexual histories; and how they share and/or distort them with each other.  But I don’t think either of the studies discussed here succeed in shedding much light on those issues.

Yet Another Rubik’s Cube

Last week I had a conversation at work about ideas for programming projects for students, and the Rubik’s cube was suggested as one idea.  There are many possible approaches to this problem, from merely modeling the cube and allowing a user to manipulate it, to actually solving the cube in some automated way.

Part of the context of the discussion was the use of Visual Python for manipulating 3D objects, which I have been using with some success for a couple of months now underneath my robot simulator.  To provide an example of how VPython alone might be useful in the classroom, I decided to write a simple model of the Rubik’s cube.  I was frankly surprised at just how simple this turned out to be; the end result was just a few dozen lines of code, collapsed here:

#!/usr/bin/env python

from visual import *

fps = 24

# Map keypresses to corresponding face colors and normal vectors.
faces = {'r': (color.red, (1, 0, 0)),
         'o': (color.orange, (-1, 0, 0)),
         'y': (color.yellow, (0, 1, 0)),
         'b': (color.blue, (0, -1, 0)),
         'w': (color.white, (0, 0, 1)),
         'g': (color.green, (0, 0, -1))}

# Create colored stickers on each face, one cubie at a time.
stickers = []
for face_color, axis in faces.values():
    for x in (-1, 0, 1):
        for y in (-1, 0, 1):

            # Start with all stickers on the top face, then rotate them "down"
            # to the appropriate face.
            sticker = box(color=face_color, pos=(x, y, 1.5),
                          length=0.98, height=0.98, width=0.05)
            cos_angle = dot((0, 0, 1), axis)
            pivot = (cross((0, 0, 1), axis) if cos_angle == 0 else (1, 0, 0))
            sticker.rotate(angle=acos(cos_angle), axis=pivot, origin=(0, 0, 0))
            stickers.append(sticker)

# Get keyboard moves and rotate the corresponding face.
while True:
    key = scene.kb.getkey()
    if key.lower() in faces:
        face_color, axis = faces[key.lower()]
        angle = ((pi / 2) if key.isupper() else -pi / 2)
        for r in arange(0, angle, angle / fps):
            rate(fps)
            for sticker in stickers:
                if dot(sticker.pos, axis) > 0.5:
                    sticker.rotate(angle=angle / fps, axis=axis,
                                   origin=(0, 0, 0))

And here is a screenshot:


Rubik’s Cube with VPython.

Much of the user interface comes for free with VPython, including rotating the cube and zooming in/out with the mouse.  (Right-click-and-drag to rotate, or drag with both mouse buttons to zoom in or out.)  My simple approach to specifying moves was to identify each face with the color of its center “cubie,” which does not move.  Press r to rotate the “red” face clockwise, or R to rotate it counterclockwise; similarly for orange, yellow, green, blue, and white.

I found this project interesting because the Rubik’s cube and its solution usually involve some group and/or graph theory to represent states of the cube and the moves from one state to another.  In this case, however, the implementation is purely geometric: instead of explicitly modeling the permutation of the cubies, we just keep track of the position and orientation of the colored “stickers” on the cubies, so that turning a face of the cube simply corresponds to rotating some 3D objects… which makes smooth animation pretty easy, too.
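For contrast, the permutation view can be sketched in a few lines.  A move is a permutation of sticker indices, and moves compose by applying one after another.  This toy example uses a single 4-cycle standing in for a quarter turn (a real face turn is a product of several such cycles over the 54 stickers):

```python
def apply_move(perm, state):
    """Apply a move: the sticker at position perm[i] moves to position i."""
    return tuple(state[perm[i]] for i in range(len(state)))

# A lone 4-cycle on a toy 4-sticker "cube."
turn = (3, 0, 1, 2)
state = ('a', 'b', 'c', 'd')

once = apply_move(turn, state)
print(once)  # ('d', 'a', 'b', 'c')

# Four quarter turns return to the identity, as group theory demands.
s = state
for _ in range(4):
    s = apply_move(turn, s)
print(s == state)  # True
```

A representation like this one, rather than a pile of 3D boxes, is what an automated solver would more likely want to search over.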

Of course, this approach has some disadvantages as well.  For example, representing the cube this way would probably not be the most suitable for an automated solver.  But my initial goal was simply to show the potential bang:buck ratio from using VPython in particular, and Python in general.  And I had some fun in the process.

Fractional Representation

I suppose this post is at least in part a book review.  I recently finished reading Numbers Rule: The Vexing Mathematics of Democracy, From Plato to the Present, by George Szpiro.  If you have an interest in government, not just in theory but in practical implementation, then I highly recommend this book.  It contains a chronological progression of our attempts to govern ourselves, and the fascinating and frustrating problems that can arise… or in many cases, are guaranteed to arise.

The book addresses two main challenges, both of which have been discussed here before.  The first challenge is how a group can collectively make decisions, elect leaders, etc.  This story has an interesting cast of characters.  For example, Lewis Carroll proposed a very interesting election method with a lot of theoretical advantages in its favor… if not for the fact that actually implementing the method turns out to be NP-hard.  (As a side benefit, I learned about a new complexity class, \Theta_2^p, or “parallel access to NP,” for which this problem is complete.)

The second challenge addressed in the book is the focus of this post: how to apportion representatives.  Just last month, the U.S. Census Bureau released the updated state population data used to reapportion the 435 seats in the House of Representatives, as directed by the Constitution: “Representatives shall be apportioned among the several States according to their respective numbers.”  Unfortunately, this is not easy to do; when allocating integral numbers of representatives to those 435 seats, some states inevitably end up with greater or lesser voting power in Congress than others.  With this latest 2010 reapportionment, residents of Rhode Island will have the greatest per capita representation, about 1.88 times that of the most under-represented residents of Montana.
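As a check on that 1.88 figure (the 2010 population figures below are approximate; Rhode Island received 2 seats and Montana 1):

```python
# Approximate 2010 census resident populations.
seats = {'Rhode Island': 2, 'Montana': 1}
population = {'Rhode Island': 1052567, 'Montana': 989415}

# Representatives per resident, for each state.
per_capita = {s: seats[s] / population[s] for s in seats}
ratio = per_capita['Rhode Island'] / per_capita['Montana']
print(round(ratio, 2))  # 1.88
```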

The history of this problem and its many proposed solutions are described with great detail and not a little humor in Szpiro’s book.  (Before you think this is a small problem, keep in mind that there are debates and lawsuits almost every ten years; in the 1920s, Congress had such a hard time with this problem that they directly violated the Constitution by simply giving up and leaving the apportionment as it had been for the previous 10 years.)

But one potential solution is mentioned only in passing, one that I think deserves more attention: why must we stick to nice, round, whole numbers?  Why not allow some– or even all– representatives to have a fractional, or at least non-integral, vote?

This is not as crazy, nor as complicated, as it sounds.  Szpiro suggests simply adding a few “fractional” representatives, at most one for each state, each of whose vote corresponds to the fractional “left over” population that causes all of the paradoxes and problems with most of the apportionment methods.  But I think a cleaner approach would be to let all of the representatives have a non-integral vote, in exact proportion to the actual population of the corresponding district.

As is usually the case with my ideas, this one is not new, being most recently proposed and described in some detail in this 2008 article by Temple law professor Jurij Toplak.  The article suggests leaving the composition of the House the way it is, even using the current Huntington-Hill method for apportioning warm bodies, although I don’t see why this is necessary.  That is, states could even choose the number of their representatives, within certain limits, based on how much “expressive power” they want for representing the diversity of opinions within the state.  Of course, this could present more interesting districting problems, as we have also discussed here before.  One step at a time, I suppose.
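For reference, the Huntington-Hill method mentioned above is short to sketch: every state starts with one seat, and each remaining seat goes to the state with the highest priority value pop / sqrt(n(n+1)), where n is that state’s current seat count.  The state names and populations here are made up.

```python
import heapq
import math

def huntington_hill(populations, seats):
    """Apportion seats by the Huntington-Hill (equal proportions) method."""
    apportionment = {s: 1 for s in populations}
    # Max-heap of priorities, simulated by negating the values.
    heap = [(-pop / math.sqrt(1 * 2), s) for s, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(seats - len(populations)):
        _, s = heapq.heappop(heap)
        apportionment[s] += 1
        n = apportionment[s]
        heapq.heappush(heap, (-populations[s] / math.sqrt(n * (n + 1)), s))
    return apportionment

print(huntington_hill({'A': 6000, 'B': 3000, 'C': 1000}, 10))
# {'A': 6, 'B': 3, 'C': 1}
```

With these toy populations the result happens to be exactly proportional; the paradoxes arise when the fractional parts do not work out so neatly.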

Is the Earth Like a Billiard Ball Or Not?

This is the second time I have come across this particular anecdote in a few months, so I thought I would weigh in on it as well.  Last night I saw on Reddit that “The Earth is relatively smoother than a billiard ball.”  This was a reference to a Wikipedia article about the shape of the Earth, quoted here in case someone gets around to fixing it:

“Local topography deviates from this idealized spheroid, though on a global scale, these deviations are very small: Earth has a tolerance of about one part in about 584, or 0.17%, from the reference spheroid, which is less than the 0.22% tolerance allowed in billiard balls.”

A few months ago, I also came across a Discover Magazine blog post from 2008 titled “Ten things you don’t know about the Earth.”  The first two “things” on the list dealt with this same issue of the shape of the Earth compared to that of a billiard ball.  The article correctly recognized the need to distinguish between smoothness and roundness, but in my opinion still came up with a wrong answer on both counts.

Before getting started, why do we care about any of this?  The motivation for this anecdote begins with the fact that the Earth is not a perfect sphere, viewed on either a large or small scale.  On a large scale (think roundness), because the Earth is spinning, its shape is best approximated not by a sphere but by an ellipsoid– specifically, an oblate spheroid, with a larger radius at the equator than at the poles.  On a smaller scale (think smoothness), the Earth is more obviously seen to not be a perfect sphere, nor is it a perfect ellipsoid, since it has all sorts of ridges, grooves, etc. corresponding to mountains, rivers, ocean trenches, etc.

The question is, how “non-spherical” is it?  The comparison with a billiard ball would be interesting if it were true, because to us a billiard ball seems to be a nearly perfect sphere, at least to the naked eye.  The source of this comparison– and, I think, the source of much of the confusion– is what the World Pool-Billiard Association has to say about a regulation billiard ball:

“All balls must be composed of cast phenolic resin plastic and measure 2-1/4 (+/-.005) inches [5.715 cm (+/- .127 mm)] in diameter.”

The key observation, I think, is that this description has nothing whatever to say about the smoothness of a billiard ball.  It does not mean, as the Discover article states, that a billiard ball “must have no pits or bumps more than 0.005 inches [sic] in height.”  Even such a small pit, bump, or groove would be easily noticeable on a billiard ball; manufacturing capabilities and requirements for smoothness are on the order of microns, much less than 0.005 inch.  (I emailed the WPA asking for clarification on this requirement, but have so far received no response.  I wonder if they get a lot of questions about this.)

The Wikipedia entry makes the same mistake, comparing the 0.22% (0.005/2.25) “tolerance” of a billiard ball to the 0.17% figure corresponding to the largest deviation of the Earth’s surface from the reference ellipsoid.  The latter almost certainly refers to the Mariana Trench, 10,911 m below sea level.  (Actually, this figure should be 0.0855%, not 0.17%, since the referenced billiard ball tolerance is relative to its diameter, not its radius.)
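The arithmetic, for the record (Earth figures approximate):

```python
ball_diameter_in = 2.25
ball_tolerance_in = 0.005
trench_m = 10911.0             # Mariana Trench depth below sea level
earth_diameter_m = 12756200.0  # equatorial diameter, approximately

ball_pct = ball_tolerance_in / ball_diameter_in * 100  # tolerance / diameter
earth_pct = trench_m / earth_diameter_m * 100          # trench / diameter

# The trench, scaled down to a 2.25-inch ball:
groove_uin = trench_m / earth_diameter_m * 2.25e6  # microinches
groove_um = groove_uin * 0.0254                    # microns

print(round(ball_pct, 2), round(earth_pct, 4))  # 0.22 vs. 0.0855
print(round(groove_uin), round(groove_um))      # ~1925 microinches, ~49 microns
```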

In any case, before we can comment on the smoothness of the Earth compared with a billiard ball, I think we require more information on either WPA rules or manufacturing standards.  [Edit: Thanks to commenter Mark Folsom for providing the following clarification of just how smooth a billiard ball is:

“125 microinches rms is a really rough surface–much more so than any billiard ball I have seen. In my estimation, a new billiard ball has a surface finish no worse than 32 microinches…”

Comments on the Bad Astronomy post give similar estimates.  And “Dr. Dave” provides some actual measurements with photos and plots showing deviations of approximately 20 microinches.  In comparison, at the scale of a billiard ball, the Mariana Trench is a groove almost 2000 microinches deep. So it seems the Earth is nowhere near as smooth as a billiard ball.]

So let us move on to roundness.  The following is quoted from the Discover blog:

“If you measure between the north and south poles, the Earth’s diameter is 12,713.6 km. If you measure across the Equator it’s 12,756.2 km, a difference of about 42.6 kilometers. Uh-oh! That’s more than our tolerance for a billiard ball. So the Earth is smooth enough, but not round enough, to qualify as a billiard ball.”

First, a minor nit: neither of the quoted diameters is correct to the given number of significant digits.  But that will not affect our calculations here.  What the article seems to miss is that the stated tolerance of a billiard ball diameter is plus or minus 0.005 inch.  That is, the diameter may be as small as 2.245 inches, or as large as 2.255 inches.  Enlarging this 0.01 inch difference to the scale of the Earth, the allowable difference in diameters is about 56.6 km, more than the actual difference of 42.8 km.  So the Earth is indeed as round as a regulation billiard ball.
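Checking the numbers (Earth diameters approximate):

```python
ball_diameter_in = 2.25
ball_tolerance_in = 0.005  # plus or minus
polar_km = 12713.5         # Earth's polar diameter, approximately
equatorial_km = 12756.3    # Earth's equatorial diameter, approximately
mean_km = (polar_km + equatorial_km) / 2

# The full allowable spread is 2 * 0.005 = 0.01 inch; scale it to Earth size.
allowed_km = 2 * ball_tolerance_in / ball_diameter_in * mean_km
actual_km = equatorial_km - polar_km
print(round(allowed_km, 1), round(actual_km, 1))  # 56.6 vs. 42.8
```

The allowable spread comfortably exceeds the actual polar-equatorial difference, so the Earth passes this (arguably misread) roundness test.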

Having said all this, I think this entire analysis abuses the spirit of the law, so to speak.  The WPA probably does not intend to allow such ellipsoidal billiard balls onto pool tables around the world, but rather to allow some variability in the size of nearly-spherical balls.  That is, the intent of the regulation is more likely that a ball should be spherical with a fixed diameter, but that diameter may be 2.245 inches for one ball, and 2.255 inches for another ball.

In summary:

  1. Is the Earth as smooth as a billiard ball?  Answer: I’m not sure.  The question may be rephrased by comparing again with the Mariana Trench: can one detect a 49-micron groove in a billiard ball, and if so, would it be acceptable to play with?  [Edit: As mentioned in the edit above, I think the answer is now a definite No.]
  2. Is the Earth as round as a billiard ball?  Answer: technically, yes… but you probably wouldn’t want to play with it.