During the last couple of weeks, I have been spending a lot of time re-visiting a problem that first attracted my interest almost 15 years ago: casino blackjack. My interest was renewed just lately after a friend at work directed me to an online message board where people were discussing some software that I had written for playing and analyzing the game. After getting involved in the discussion, I ended up making some significant improvements and additions to the code, and learning some interesting mathematics in the process.
The software is available at the usual location here. Everything but the pre-built executables is also on Bitbucket:
hg clone http://bitbucket.org/possiblywrong/blackjack
There are two programs in the release zip archive; one is a game/trainer, that lets you play blackjack with customizable rule variations, number of decks, etc. Here is a screen shot:
As you can see, this was a long time ago. The graphics are thanks to the Allegro game programming library.
The other program is a strategy calculator, that generates a printable strategy table for particular rule variations, and allows you to inspect expected values for specific player hands.
Although there are a lot of blackjack software packages out there, I think a few features combined make this one unique:
1. It’s exact. (At least, it is now– more on this shortly.) Both the game and strategy calculator use the same underlying “engine” to compute exact probabilities and expected values based on combinatorial analysis, as opposed to Monte Carlo simulation. This is where the interesting mathematics lives, and is the main reason I enjoy studying this and other casino games. (Unfortunately, I am not actually much of a gambler.)
2. It’s fast. Although traversing the “tree” of possible outcomes of a particular hand is pretty simple in principle (e.g., a recursive function of a dozen lines or so would compute the dealer probabilities), doing so quickly is another matter. The game is fast enough to allow “perfect card counting,” continuously updating the exact player edge (vs. a “true count” estimate) prior to each hand. This is where the interesting computer science lives, and is the other reason I continue to re-visit the game.
3. It’s free. Executables (pre-built for Windows) and complete source code are available for anyone who wants to use them, subject to the GPL. The code for the computational engine is separate from any external interface, and so may be used in other applications. (I know of a few instances where people have done just that; for example, an old blackjack game in the GNOME distribution used this engine.)
But about that claim of “exact.” It turns out that blackjack would be a relatively simple game to analyze, if not for one particular feature: splitting pairs. Splits greatly complicate analysis of the game; there are many challenges with even specifying a strategy involving splits, let alone evaluating or optimizing such a strategy. For example, should the player be allowed to consider all cards among all split hands in a round when making decisions, or just the cards in the current (possibly split) hand? The former would be optimal but too hopelessly complex to memorize, while the latter is realistically playable but sub-optimal. And there are more possibilities in between these two extremes.
The most significant of these latest improvements is to the algorithm for computing expected values for re-splitting pairs. Prior to these changes, split values were exact only when no re-splitting is allowed; i.e., a pair may only be split once. When re-splitting is allowed (usually up to a maximum of four split hands), my previous algorithm only approximated the corresponding expected values.
With these latest changes, the code now computes exact expected values for re-splitting as well, for several different choices of complexity of “post-split” playing strategy. The details of the algorithm are described in this paper. (Most of this paper is over eight years old; the last section is the recent addition that allows an efficient implementation.)
Finally, recall the red/black card puzzle from several months ago? That solution turns out to be useful once again here, by providing an easy test for proving that a particular algorithm for computing expected values– in almost any card game, not just blackjack– is either incorrect or an inaccurate approximation. The idea is as follows: consider using some algorithm to compute the overall expected return from a single round of blackjack, played from the top of a full shoe, and the corresponding playing strategy that realizes that expected return. (For example, the expected return for a typical 6-deck game is -0.615389123%.)
Now recompute overall expected return again, using that same playing strategy, but this time using a shoe with one ace removed. Similarly, compute the expected value with one two removed, then with one three removed, etc., all with the same playing strategy.
The “Extended” True Count Theorem (first proved by Thorpe, generalized slightly here) says that if we compute the average of these expected values, weighted by the probability of each corresponding removed card, this should equal the overall expected value computed for the full shoe. Put another way, the average “effect of removal” should be zero.
So if using the same algorithm to compute the overall expected value in these two different ways yields two different answers, then the algorithm must be incorrect. Of course, the converse is not true: equality does not imply correctness. But I have still found this to be a useful additional sanity check and debugging tool when working on algorithms like this one.
Edit: This is some interesting simple analysis using the strategy engine. The following plots show expected return versus shoe penetration for both basic and optimal strategy. For each plot, a single simulated player played 500 shoes. On the left (think “perfect card counting”), the player used the same composition-dependent, “full-shoe” basic strategy for every hand. On the right (think “perfect card counting and perfect play“), the player used a different strategy continuously re-optimized for each hand, based on the current depleted shoe.
Each blue point indicates exact expected return (i.e., not just a true count) for the corresponding depleted shoe prior to each hand. The red curves indicate 10th percentiles of the distribution of expected return.