## An optimal stopping card game

Let’s play the following simple game: I will shuffle a standard 52-card deck, and deal one card at a time face up onto the table between us.  You can say “Stop” at any time, at which point I will pay you an amount equal to the fraction of cards dealt so far that are red.  For example, if you stop after seeing a single card, you win 1 if it is red, 0 if it is black, with expected return 1/2.  Of course, you can wait until the entire deck is dealt, also with a guaranteed return of 1/2.

The problem is: can you do better than this?  What is the optimal strategy for playing this game, and what is the corresponding expected return?

(This game is a variant of the following interesting game that I read about this week, which at first glance might seem even simpler, but for which optimal strategy remains an open problem: instead of dealing cards from a deck, I flip a coin repeatedly, and upon stopping I pay you an amount equal to the fraction of tosses that came up heads.)
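(Spoiler warning for the card version: its value can be computed exactly by backward induction on the number of red and black cards remaining. The following Python sketch does so; the function name is mine.)

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def value(r, b):
    """Optimal expected payoff with r red and b black cards remaining."""
    dealt = 52 - r - b
    stop = (26 - r) / dealt if dealt > 0 else 0.0  # payoff if we stop now
    if r + b == 0:
        return stop
    # Otherwise, compare stopping against dealing one more card.
    cont = (r * value(r - 1, b) + b * value(r, b - 1)) / (r + b)
    return max(stop, cont)

print(value(26, 26))  # the value of the game
```

One easy lower bound shows that the answer must exceed 1/2: stop immediately if the first card is red, otherwise wait for the whole deck, guaranteeing (1/2)(1) + (1/2)(1/2) = 3/4.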

Posted in Uncategorized | 10 Comments

## Floating-point equality: It’s worse than you think

While recently helping to fix some buggy simulation software, I encountered a nasty problem involving comparison of floating-point values.  I was vaguely aware of the potential for this sort of problem to occur, but this was the first time that I had actually seen it in the wild.

The program was an event-driven simulation, with a periodic “update” event that was supposed to occur, well, periodically (5 times per simulated second).  But it would consistently hang after a couple of minutes, getting stuck in an infinite loop because the periodic update event stopped triggering.  Here is what the code looked like, with the actual time values that caused the problem:

double curr_time = 128.1253990141936;
double prev_time = 127.9253990141936;
if (curr_time >= prev_time + 0.200) {
    update();
} else {
    // We shouldn't be here.
}


The if-condition was expected to evaluate to true, indicating that enough time had passed since the previous update.  But the condition failed, thus preventing the update and associated time advance.  See if you can identify the cause of the problem.

At this point, the usual from-the-hip diagnosis is, “Floating-point arithmetic is only an approximation; you should never test values for equality, only for differences within some tolerance.  You should probably read What Every Computer Scientist Should Know About Floating-Point Arithmetic, by David Goldberg.”

This is mostly good advice… but when reading online discussions about this issue, I often get the impression that part of Goldberg’s story is missed.  It is certainly important to recognize that floating-point values have limited precision, so that arithmetic and comparisons that we might expect to be valid for real numbers on a cocktail napkin need not hold when evaluated with limited precision.  For example, evaluating the expression $1.1+2.2==3.3$ typically yields false, since none of the three values can be represented exactly as a rational number whose denominator is a power of 2.  So instead of the equality you want, the computer is thinking:

$\frac{2476979795053773}{2251799813685248} + \frac{2476979795053773}{1125899906842624} \neq \frac{3715469692580659}{1125899906842624}$
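We can verify this directly in Python, where the `fractions` module recovers the exact rational value of each double:

```python
from fractions import Fraction

print(1.1 + 2.2 == 3.3)    # False
print(Fraction(1.1))       # 2476979795053773/2251799813685248
print(Fraction(2.2))       # 2476979795053773/1125899906842624
print(Fraction(3.3))       # 3715469692580659/1125899906842624
```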

But in the case of the simulation bug described above, this wasn’t the problem.  To see this, let’s add some debugging output to the program:

double curr_time = 128.1253990141936;
double prev_time = 127.9253990141936;
if (curr_time >= prev_time + 0.200) {
    update();
} else {
    double test_time = prev_time + 0.200;
    std::printf("curr_time = %a\n", curr_time);
    std::printf("test_time = %a\n", test_time);
}

Output:
curr_time = 0x1.004034p+7
test_time = 0x1.004034p+7


The two values being compared are exactly equal!  That is, both the current time (128.1253990141936) and the test time (the sum 127.9253990141936+0.200) are represented by exactly the same 64 bits in the underlying IEEE-754 double-precision representation.

So what’s going on?  If two floating-point values are indeed equal, how can they fail a weak inequality comparison?

It turns out that the problem is not that precision is limited, but that precision is varying, in ways that the programmer cannot predict or control.  In the if-condition, the compiler has decided to evaluate the sum on the right-hand side in extended precision (the 80-bit x87 format, sometimes padded to 96 bits in memory), and then perform the comparison, still in an extended-precision register, without first rounding back down to 64-bit double precision.  As a result, those additional extended-precision bits make the sum slightly greater than the “truncated” double-precision value of the current time on the left-hand side.

This can be a particularly nasty problem to troubleshoot, because the behavior is very dependent on platform, compiler, and even “code context.”  (In this case, the simulation was running on Linux using gcc 3.4.4.  I have been unable to reproduce the problem anywhere else.)  By “code context” I mean that the compiler has a lot of leeway in determining if/when to jump back and forth between extended and double precision, and those decisions can change depending on surrounding code!  For example, we “masked” the behavior somewhat by our unsuccessful attempt at debugging output above, since the compiler is forced to eventually round the sum and store it in a double-precision variable so that we can print it.

Goldberg does describe this issue, although the reader must make it 80% of the way through a 70-page paper to find it.  But the C++ FAQ also contains a section (Cline calls it the “newbie section,” but it’s really more of an attic for anecdotes without a better home) that I think does a great job of describing the problem as well.

Edit: The Random ASCII blog has a great series of articles on floating-point issues.  The one most directly relevant to the problem discussed here is titled “Floating-Point Determinism,” in particular the section on composing larger expressions.


## Sudoku Solver

This is in competition for my most uninteresting post.  But it’s hopefully a useful snippet for those few with whom I’ve had recent discussion about Sudoku, and algorithms for solving– and more interestingly, in my opinion, generating– instances of the puzzle.

Writing a program to solve Sudoku puzzles seems to have become a standard programming exercise, whether as an experienced developer learning a new language, or as a beginner learning about, say, recursion and depth-first search.  (In my day it was the Eight Queens Puzzle.)  It’s a nice problem; the backtracking part is relatively straightforward, but the test for whether a candidate solution is valid is slightly more complex, or at least potentially awkward, depending on the language at your disposal.

Following is my implementation in Mathematica, that attempts to combine “short” with “still readable”:

ValidSudokuQ[s_, n_, {i_, j_}] :=
  And @@ Map[
    Count[#, n] <= 1 &,
    {
      s[[i, All]],
      s[[All, j]],
      Partition[s, {3, 3}][[Ceiling[i/3], Ceiling[j/3]]] // Flatten
    }
  ]

$RecursionLimit = Infinity;

SolveSudoku[s_] :=
  Module[
    {p, r, n},
    p = Position[s, 0];
    If[p === {},
      Print[s],
      p = First[p];
      Do[
        r = ReplacePart[s, n, p];
        If[ValidSudokuQ[r, n, p], SolveSudoku[r]],
        {n, 9}
      ]
    ]
  ]


We can clean this up a bit with some typesetting, to print the solutions in their usual form:

ShowSudoku[s_] :=
  GridBox[s,
    GridFrame -> True,
    ColumnLines -> {True, True, 2, True, True, 2, True},
    RowLines -> {True, True, 2, True, True, 2, True}
  ] // DisplayForm


For example, starting with a completely empty grid, the above algorithm yields the “first” (i.e., lexicographically least) Sudoku instance:

The “first” Sudoku puzzle.

## “Cold hit” DNA profiling

I recently read the book Math on Trial: How Numbers Get Used and Abused in the Courtroom, by the mother-daughter team of Leila Schneps and Coralie Colmez.  It’s an interesting, easy, and short read, just a little over 200 pages, with each chapter being a mostly self-contained description of a legal case where mathematics was involved in one way or another… and in particular, where mathematics was potentially misused, misunderstood, etc.

For example, one of the chapters discusses the case of the murder of Meredith Kercher, which has new relevance given Amanda Knox’s recent re-emergence into the media spotlight as the comedy of the Italian court system re-tries her case.  In my opinion, the authors seem to clearly think that both Knox and her then-boyfriend, Raffaele Sollecito, are guilty, and much of the focus of the discussion is on the mathematical benefit of repeating a DNA test with a probabilistic outcome.  It will be interesting to see if this “second test” is indeed conducted again, and if so, what will be the result.

But my focus here is on another chapter in the book discussing the murder of Diana Sylvester in 1972, and the case against John Puckett, identified in 2003, over three decades later, by “cold hit” DNA testing of sperm from Diana’s body.  By “cold hit” is meant that the DNA was compared with all of the samples in a large database of “thousands of known California criminals,” without any prior suspicion of any particular person in that database in relation to this particular crime.  Puckett came up as a match, and was eventually convicted of first-degree murder and sentenced to life in prison.

The interesting mathematical question is, what is the probability that Puckett was in fact innocent of the crime, and was instead merely an unfortunate lottery winner?  That is, even if no one in the searched database committed the crime in question, what is the probability that someone in the database would be identified as a match by chance?

To make the problem slightly more disturbing, imagine a not-so-distant future where, instead of a database confined to known criminals in California, we expand the database to all of the 300+ million mostly law-abiding people in the United States.  Now what is the probability of such a “cold hit” occurring due to chance, as opposed to guilt?

This case has generated a lot of controversy and debate.  For example, mathematician Keith Devlin has written several MAA columns questioning the practice of cold hit DNA profiling (see references below), citing a 2005 study of an Arizona criminal database containing just 65,493 entries, in which a seemingly surprisingly large number of pairs of entries were found that matched at nine or more gene loci:

“A study of the Arizona CODIS database carried out in 2005 showed that approximately 1 in every 228 profiles in the database matched another profile in the database at nine or more loci, that approximately 1 in every 1,489 profiles matched at 10 loci, 1 in 16,374 profiles matched at 11 loci, and 1 in 32,747 matched at 12 loci.”

In my opinion, this is a rather misleading, or at least confusing, way of presenting what was actually found in the study.  In Math on Trial, the authors also mention this study, and I think they do a better job of more clearly describing the matches that were found, and attempting to explain them.  (Although I admit that I disagree with some of their conclusions and arguments.)  Essentially, the heart of the issue is the Birthday Problem, and whether such random matches in large databases are a symptom of a dangerously flawed law enforcement practice, or whether they are an expected (although perhaps unintuitive) but irrelevant mathematical anomaly.
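To see why such matches are expected, note that what matters is not the number of profiles in the database, but the number of pairs of profiles being compared.  A back-of-the-envelope sketch (the per-pair match probability below is purely illustrative, not a quoted forensic figure):

```python
from math import comb

n = 65_493          # profiles in the Arizona database
pairs = comb(n, 2)  # distinct pairs compared: over two billion
print(pairs)

# For any hypothetical per-pair probability p of a 9-locus match,
# the expected number of matching pairs is pairs * p.
p = 1e-7            # illustrative value only
print(pairs * p)    # hundreds of expected matches
```

With billions of pairwise comparisons, even a tiny per-pair match probability makes some matches in the database nearly certain, which is exactly the Birthday Problem effect.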

References:

1. Schneps, L. and Colmez, C., Math on Trial: How Numbers Get Used and Abused in the Courtroom, New York: Basic Books, 2013.
2. Devlin, K., Devlin’s Angle, “Damned lies” (blog post), October 2006. [HTML]
3. Devlin, K. Devlin’s Angle, “DNA math and the end of innocence” (blog post), January 2007. [HTML]
4. Devlin, K. Devlin’s Angle, “Bad Math, Bad Thinking: the BMI and DNA Identification Revisited” (blog post), February 2011. [HTML]

## Efficiency of card counting in blackjack (Part 3)

Introduction

This is the third and last of a series of posts on card counting in blackjack.  In Part 1, we started with the simplest reasonable “basic” playing strategy, in which decisions to stand, hit, double down, etc., are based solely on the player’s current hand total and the dealer’s up card.  (For example, always hit hard 16 against a dealer ten.)  Yet even using this fixed playing strategy, a player’s expected return can vary significantly from round to round, since the dealer deals multiple rounds from the same shoe before reshuffling.

In Part 2, we described how to estimate this varying expected return using the true count, calculated– in your head, at the table– as a linear combination of the probabilities of card ranks remaining in the current depleted shoe.  The true count dictates betting strategy, betting more on rounds estimated to have a favorable advantage for the player.

But we can also use the true count to vary playing strategy.  My objective in this post is two-fold.  First, I will describe the best possible playing strategy, and the maximum gain in expected return that can possibly be achieved by employing such a strategy.  Second, I will describe the more realistic indexed playing strategies that use a true count, and measure their efficiency, i.e., how close they come to realizing that best possible gain in expected return.

The perfect card counter

What does “perfect play” mean?  In the context of this discussion, I mean the playing strategy that maximizes the expected return for each round, assuming perfect knowledge of the distribution of card ranks remaining in each corresponding depleted shoe.  In other words, what if you could bring a laptop with you to the table?

Prior to each round, there are two interesting expected values to consider, that are essentially the endpoints of the spectrum of possible performance by a blackjack player.  First, at the low end, there is the expected value $v_{BASIC}$ assuming that the player uses fixed, total-dependent basic strategy (as described in Part 1).  At the high end, there is the expected value $v_{OPT}$ assuming that the player instead plays perfectly, assuming knowledge of exactly how many cards of each rank remain in the current shoe.

The following figure shows the difference between these two.  That is, how much can the basic strategy player possibly expect to gain by varying playing strategy?  Each gray point represents one simulated round of play.  The x-coordinate of each point indicates the corresponding $v_{BASIC}$; the y-coordinate indicates $v_{OPT} - v_{BASIC}$.  As before, the scatterplot is overlaid with a smoothed histogram to indicate the greater density of points near the origin.

Expected gain from using optimal strategy, vs. expected return from fixed, total-dependent basic strategy.

What is the overall per-round expected return for these two “endpoint” strategies?  The basic strategy player’s expected return is about -0.4239% (note that this value is less than the “full-shoe” expected return quoted in Part 1 due to the cut card effect), while perfect play yields an expected return of only -0.2333%.  In other words, even equipped with a laptop at the table, the house still has an advantage!  This is not as surprising as it sounds; since we are focusing on playing efficiency, we are assuming flat betting.  This merely emphasizes the point that, in shoe games, accurate betting strategy is more important than varying playing strategy.

The figure above essentially shows the “distance” between a basic strategy player and a perfect player.  The performance of any actual card counting system, no matter how simple or complex, will lie somewhere in between these two extremes.  If we define the playing efficiency of basic strategy to be zero, and the playing efficiency of perfect play to be one, then the efficiency $PE$ of any other strategy is calculated using its per-round expected return $v$ according to

$PE(v) = \frac{v + 0.004239}{0.001906}$

It now remains only to compute this expected return for some actual card counting strategies of interest, and evaluate their corresponding efficiencies.

(A word of caution: before anyone runs off quoting this as “the” formula for playing efficiency, note that these particular constants depend on all of the rule variations, number of decks, and penetration assumed at the outset of this discussion.)
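With those caveats, the efficiency calculation itself is just a linear rescaling between the two endpoint returns; here is a one-line sketch using the constants from this discussion (the function name is mine):

```python
def playing_efficiency(v, v_basic=-0.004239, v_opt=-0.002333):
    """Playing efficiency: 0 for basic strategy, 1 for perfect play."""
    return (v - v_basic) / (v_opt - v_basic)

print(playing_efficiency(-0.004239))  # 0.0 (basic strategy)
print(playing_efficiency(-0.002333))  # 1.0 (perfect play)
```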

True count indices

The latest additions to my blackjack analysis software allow exact evaluation of indexed playing strategies that vary based on the true count.  For example, the most common refinement of basic playing strategy is to hit hard 16 against a dealer ten… unless the true count is zero or greater, in which case you should stand.  A more complex example is soft 18 vs. dealer 2.  Basic strategy in this situation is to stand, but a more complex index strategy is to hit if the true count is less than -17, stand if it’s less than 1 (but at least -17), otherwise double down.

More generally, we can specify an arbitrarily complex indexed playing strategy as a list of “exceptions” to total-dependent basic strategy.  Each exception is identified by a tuple $(h, u, d, p, r)$, where

• $h$ is the player’s hand total, with a negative value indicating a soft hand.
• $u$ is the dealer’s up card.
• $d$ is 1 if the player is allowed to double down on the hand, otherwise 0.
• $p$ is 1 if the player is allowed to split the pair hand, otherwise 0.
• $r$ is 1 if the player is allowed to surrender the hand, otherwise 0.

For each of these situations, the indexed playing strategy is given by a partition of the real line into half-open intervals of possible true counts, where each interval corresponds to a particular playing decision, encoded as 1=stand, 2=hit, 3=double down, 4=split, or 5=surrender.

For example, following are the so-called “Illustrious 18” index plays using the Hi-Lo true count.  Compare this machine-readable format with the original list generated by Cacarulo at bjmath.com.  Note how the playing decisions are interleaved with the true count indices indicating the endpoints of the corresponding intervals, with +1000 acting as “positive infinity.”

# Hi-Lo Illustrious 18 Revisited (Cacarulo)
# cnt up  dbl spl sur p1 tc1 p2 ... +1000
+16 10   0   0   0  2   0  1      +1000
+16 10   1   0   0  2   0  1      +1000
+12  3   0   0   0  2  +2  1      +1000
+12  3   1   0   0  2  +2  1      +1000
+15 10   0   0   0  2  +4  1      +1000
+15 10   1   0   0  2  +4  1      +1000
+11  1   1   0   0  2  +1  3      +1000
+12  2   0   0   0  2  +3  1      +1000
+12  2   1   0   0  2  +3  1      +1000
+9  2   1   0   0  2  +1  3      +1000
+20  5   0   1   0  1  +5  4      +1000
+20  5   1   1   0  1  +5  4      +1000
+20  6   0   1   0  1  +4  4      +1000
+20  6   1   1   0  1  +4  4      +1000
+8  6   1   0   0  2  +2  3      +1000
+16  9   0   0   0  2  +4  1      +1000
+16  9   1   0   0  2  +4  1      +1000
-19  6   1   0   0  1  +1  3      +1000
+12  4   0   0   0  2   0  1      +1000
+12  4   1   0   0  2   0  1      +1000
-19  5   1   0   0  1  +1  3      +1000
+13  2   0   0   0  2  -1  1      +1000
+13  2   1   0   0  2  -1  1      +1000
+10  1   1   0   0  2  +3  3      +1000
+10  1   1   1   0  2  +3  3      +1000
+8  5   1   0   0  2  +4  3      +1000
+9  7   1   0   0  2  +3  3      +1000
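
To make the format concrete, here is a hypothetical Python parser for one line of this format (the field names are mine):

```python
def parse_index_play(line):
    """Parse one line of the index-play format above.  Returns the hand
    tuple (h, u, d, p, r) and a list of (decision, upper_tc) pairs, where
    each decision applies for true counts below upper_tc, and +1000
    stands in for positive infinity."""
    fields = line.split()
    h, u, d, p, r = (int(x) for x in fields[:5])
    rest = fields[5:]
    decisions = [int(x) for x in rest[0::2]]  # 1=stand, 2=hit, 3=double, ...
    uppers = [int(x) for x in rest[1::2]]     # interval right endpoints
    return (h, u, d, p, r), list(zip(decisions, uppers))

# Hard 16 vs. dealer ten: hit (2) below TC 0, stand (1) from 0 up.
print(parse_index_play("+16 10 0 0 0 2 0 1 +1000"))
```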


In addition to the Hi-Lo Illustrious 18, I also generated full sets of indices for the Hi-Lo and Hi-Opt II counts, using CVIndex, part of the Casino Vérité suite of blackjack analysis software.

The new analysis capability that I am excited about– that motivated this series of posts– is the ability to quickly compute the exact expected return for any given subset of cards in a depleted shoe, using an indexed playing strategy as specified above.

Efficiency results

The following figures show the distribution of gain in expected return, similar to the figure above, for three different and progressively more complex card counting systems:

• Hi-Lo Illustrious 18 (as revised by Cacarulo)
• Hi-Lo with full indices
• Hi-Opt II with full indices

For easier comparison of improvement in performance, each figure has the same axis limits as the “baseline” figure above.

Hi-Lo Illustrious 18 (revisited)

Expected gain from using optimal strategy, vs. expected return using Hi-Lo Illustrious 18 indices.

Hi-Lo full indices

Expected gain from using optimal strategy, vs. expected return using full Hi-Lo indices.

Hi-Opt II full indices

Expected gain from using optimal strategy, vs. expected return using full Hi-Opt II indices.

Finally, we can compute the corresponding playing efficiencies:

• Hi-Lo Illustrious 18 has playing efficiency PE = 0.309.
• Hi-Lo with full indices has PE = 0.470.
• Hi-Opt II with full indices has PE = 0.639.

I think this analysis raises as many questions as it answers.  For example, these more accurate calculations of playing efficiency are lower than the approximations given by Griffin (see Chapter 4 in the reference below).  There are several possible reasons for the difference: is the approximation inherently biased, or is the difference simply due to different assumptions about the number of decks, penetration, etc.?

References:

1. Griffin, Peter A., The Theory of Blackjack, 6th ed. Las Vegas: Huntington Press, 1999.

## Password protection of worksheets in Excel

This is a detour from the last couple of posts; I will pick up the blackjack thread again later.  I was distracted recently by a desire to “unprotect” a password-protected Excel spreadsheet in order to view some hidden cell contents.  It turns out this is not a new problem, but it is an interesting exercise, with an opportunity to improve on what seems to be the “standard” solution.

OpenOffice.org (PDF, see Section 4.18.4) provides a description of the password protection algorithm, which is not very strong.  (This is not surprising, since it’s not intended to prevent disclosure.  As the help documentation points out, it’s really just a safety net for editing, “to help prevent others from changing, moving, or deleting important data.”)  When you enter a password to protect a worksheet, Excel does not actually store the password itself, only a 16-bit hash of the password.  When you subsequently enter the password to attempt to unprotect the worksheet, Excel simply compares the hash of the password you just entered with the stored hash value.

There are about 10^30 different possible passwords with a maximum of 15 characters.  But there are only 32,768 different possible hash values, so collisions should be easy to come by.  That is, we don’t have to recover the original password to unprotect the worksheet, we only have to enter a password with the same hash value.  We can brute-force attack the problem by trying passwords with all possible hash values.

The hash function is simple to describe: compute the bit-wise exclusive-OR of (1) the constant 0xCE4B, (2) the password length, and (3) the ASCII value of each character of the password, as a 15-bit value rotated left bit-wise by its position.  That is, rotate the first character 1 bit left, the second character 2 bits left, etc.  The following Python code implements this function:

def excel_hash(password):
    """Return hash of given string, in the range [0x8000, 0xffff]."""
    h = 0
    # Process characters last to first, so that the i-th character
    # ends up rotated left by i bits.
    for ch in reversed(password):
        h = h ^ ord(ch)
        h = ((h << 1) & 0x7fff) | (h >> 14)
    return h ^ len(password) ^ 0xce4b


There is a solution that has been posted and re-posted many times, which essentially tries all 12-character passwords in which each of the first 11 characters is A or B (and the 12th is any printable ASCII character).  This enumeration is guaranteed to yield all 32,768 possible hash values.  It works.  So that should be the end of it… except that I’m not sure why this particular enumeration was chosen, since it isn’t clear to me that it “covers” all possible hash values in any structured way.

First, it is more exhaustive than it needs to be.  That is, I thought perhaps this approach might have been the minimal result of a simple blind empirical search, trying longer and longer passwords of the form “[A|B]*.” (with the last printable character to “fill in the gaps”) until all possible hash values were covered.  But this isn’t the case, since we can make the solution 4 times faster by using just 10-character passwords– i.e., where the first 9 are A or B, and the 10th is any printable character.

Second, why use the characters A and B?  The ASCII values for A and B are 65 and 66, respectively, meaning that we are toggling two consecutive bits with each character, not just one, and those pairs of changing bits for consecutive password characters overlap in position, which doesn’t seem helpful.

My first thought was, “The dumbest thing that could possibly work is to use 15-character passwords, where each character is, say, either B or C.”  This way, each character toggles exactly one distinct bit in the hash, so that we try exactly one password for each possible hash value.  And indeed, this approach does work, and is about 6 times faster than the commonly-cited solution.

But instead of a 15-level nested for-loop, we can shorten the code somewhat by just varying three characters in the password, with each character exploring 5 bits of the hash.  The only trick is (1) keeping the characters’ ASCII values in the printable range, and (2) “rotating” those three characters so that their blocks of 5 bits do not overlap, by inserting constant password characters between them.
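Here is a quick Python check, re-implementing the hash as described above, that this password pattern hits every one of the 32,768 possible hash values exactly once (the four dots are the constant “spacer” characters):

```python
def excel_hash(password):
    """15-bit Excel worksheet password hash, in the range [0x8000, 0xffff]."""
    h = 0
    for ch in reversed(password):  # last to first: i-th char rotated i bits
        h = h ^ ord(ch)
        h = ((h << 1) & 0x7fff) | (h >> 14)
    return h ^ len(password) ^ 0xCE4B

# Enumerate the 32^3 candidate passwords; each character varies over 5 bits,
# and the constant dots rotate those 5-bit blocks into disjoint positions.
hashes = set()
for c1 in range(64, 96):
    for c2 in range(64, 96):
        for c3 in range(64, 96):
            hashes.add(excel_hash(chr(c1) + "...." + chr(c2) + "...." + chr(c3)))
print(len(hashes))  # 32768
```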

The result is the following code, with instructions on how to use it:

1. Open the worksheet that you want to unprotect.
2. Press Alt-F11 to open Visual Basic.
3. Press F7 to open an editor window.
4. Copy/paste the code below into the editor window.
Sub FindPassword()
    Dim c1 As Integer, c2 As Integer, c3 As Integer
    Dim passwd As String
    On Error Resume Next
    For c1 = 64 To 95
        For c2 = 64 To 95
            For c3 = 64 To 95
                passwd = Chr(c1) & "...." & Chr(c2) & "...." & Chr(c3)
                ActiveSheet.Unprotect passwd
                If ActiveSheet.ProtectContents = False Then
                    MsgBox "Unprotected using password: '" & passwd & "'"
                    Exit Sub
                End If
            Next
        Next
    Next
End Sub


5. Finally, press F5 to run.  A dialog is displayed indicating a valid password.  The worksheet is now unprotected.


## Efficiency of card counting in blackjack (Part 2)

Introduction

Continuing from last time, recall that card counting systems may be used in two different ways: to vary betting strategy, betting more or less on each round based on an estimate of the current expected return; but possibly also to vary playing strategy, affecting decisions to stand, hit, etc., based on estimates of favorability of the various options.

My objective in this still-introductory post is to focus on the first of these two roles, describing how typical card counting systems work, and how they are used to estimate expected return.  Next time, I will finally get to the new stuff, dealing with playing strategy “indices,” and the concept of “efficiency” as a measure of how close various card counting systems are to being optimal.

The True Count

Given a playing strategy (which we are currently assuming is the fixed, total-dependent basic strategy from the last post), we can compute the exact expected return for a round played with that strategy, as a function of the shoe composition prior to the deal.  A shoe composition is specified by a vector s indicating the number of cards of each rank ace through ten.  For example, a full n-deck shoe is given by

$\mathbf{s} = n \mathbf{d}$, where $\mathbf{d} = (4,4,4,4,4,4,4,4,4,16)$

The actual expected return is a complicated non-linear function of s.  A typical card counting system estimates this expected return using the true count (TC), a simpler linear function that a player can compute in his head: basically, the true count is a weighted average of the probabilities of card ranks remaining in the shoe.

To make this precise, let t be a vector of “tags,” or weights, associated with each card rank ace through ten.  Different systems use different tags; for example, the very common Hi-Lo system, first described by Harvey Dubner 50 years ago, uses the tags

$\mathbf{t} = (-1,1,1,1,1,1,0,0,0,-1)$

Then the true count for a given shoe composition is defined to be

$TC_\mathbf{t}(\mathbf{s}) = \frac{-\mathbf{t} \cdot \mathbf{s}}{n(\mathbf{s})}$

where, if we temporarily define n(s) to be the total number of cards remaining in the shoe, we see that the true count is just a sum of the card probabilities, weighted by -t.

We need to revise this definition slightly, though, to reflect how the true count is actually computed at the table.  First, the numerator in the formula is the running count (RC); the reader can verify that we can mentally maintain the running count throughout a shoe by adding the tag $t_i$ for each card of rank i that we see dealt, starting with an initial running count (IRC) of $-\mathbf{t}\cdot(n\mathbf{d})$ for a full n-deck shoe.  (Note that the IRC for the Hi-Lo system above is conveniently equal to zero for any number of decks; more on this later.)

Whenever we need to compute the true count, we simply divide the running count by n(s).  The problem is that it is difficult to keep track of exactly how many cards remain in the shoe.  The solution in most card counting systems is to instead divide by the number of decks remaining (i.e., blocks of 52 cards), estimated with some coarser resolution.  For example, if we estimate the number of decks by rounding to the nearest half-deck, then the true count divisor is

$n(\mathbf{s}) = \frac{1}{2} \left \lfloor \frac{\sum s_i}{26} + \frac{1}{2} \right \rfloor$

I will use this definition of the true count divisor for the rest of this discussion.  At this point, I think it is important to note two effects of this change.  First, we have relaxed the amount of mental effort required, by approximating the number of cards remaining in the shoe.  But a second effect, perhaps more important but rarely made explicit, is that we have effectively introduced a scale factor, multiplying all of our previously computed true counts by 52.  This is also helpful for the human player, since the resulting true counts range over a wider interval and may be approximated by integers, instead of being confined to small fractional values typically in the interval (-1, 1).
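As a sketch of these definitions in code (the function name and the six-deck examples are mine; a balanced count with IRC equal to zero is assumed, so the running count is simply $-\mathbf{t}\cdot\mathbf{s}$):

```python
HI_LO = (-1, 1, 1, 1, 1, 1, 0, 0, 0, -1)  # tags for ranks ace through ten

def true_count(shoe, tags=HI_LO):
    """True count for a depleted shoe, given as counts of each rank
    ace..ten: running count divided by decks remaining, rounded to
    the nearest half deck."""
    running = -sum(t * s for t, s in zip(tags, shoe))
    cards = sum(shoe)
    decks = max(0.5 * int(cards / 26 + 0.5), 0.5)  # nearest half deck
    return running / decks

# A full six-deck shoe: running count 0, so true count 0.
print(true_count((24,) * 9 + (96,)))               # 0.0

# After dealing one each of 2-6: running count +5, about 6 decks left.
print(true_count((24, 23, 23, 23, 23, 23, 24, 24, 24, 96)))  # 5/6
```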

Accuracy of true count estimation of expected return

So how well does this work?  The following figure shows the actual expected return– still using fixed total-dependent basic strategy– vs. the Hi-Lo true count, for each of 4.3 million rounds (gray points) played over 100,000 shoes.  Here again, the color is an overlaid smoothed histogram to show the greater density of points near the origin.

Expected return vs. Hi-Lo true count, using fixed total-dependent basic strategy.

The correlation coefficient is 0.953; by comparison, if the true count were a perfect linear predictor of expected return, these points would all lie exactly on a straight line, with a correlation of 1.0.

Can we do better than this?  It turns out that we can… but only if we relax some constraints on the counting systems we are allowed to use.

First, recall that the initial running count (IRC) for the Hi-Lo system is zero.  Such systems are called “balanced”; systems with a non-zero IRC are called “unbalanced.”  It is not clear to me how useful this distinction is, or in particular why unbalanced counts are often advertised as “not requiring a true count conversion,” as if the definition of the true count above depends in any way on whether the IRC happens to be zero or not.

At any rate, if we expand our space of possible counting systems to include unbalanced counts, then the Knockout, or K-O system, which counts sevens as +1, has a slightly higher correlation of 0.955.

However, these two systems– one balanced, one unbalanced– are only optimal among “Level 1” systems, i.e., systems with tags in {-1, 0, +1}.  If we consider “Level 2” systems with tags chosen from {-2, -1, 0, +1, +2}, then the balanced count with the highest correlation with basic strategy expected return is $(-2, 2, 2, 2, 2, 2, 1, 0, -1, -2)$, with a correlation of 0.967.  The best overall Level 2 count is the unbalanced $(-2, 1, 2, 2, 2, 2, 1, 0, -1, -2)$, with a correlation of 0.973.

Betting isn’t everything

There is a reason why you have probably never heard of or seen these two Level 2 counts in the wild.  Note that the optimality of these counting systems depends on the specific rule variations, penetration, and fixed playing strategy assumed so far.  Also, these systems are only “optimal” in the sense of maximizing correlation between true count and actual expected return.  This is still an imperfect measure of “betting efficiency,” which we have yet to define precisely.

But before diving deeper into betting efficiency, we are now in a position to address playing efficiency… where these systems generally suffer.  So far, the playing strategy has been fixed, total-dependent “basic” strategy, depending only on the cards in the player’s current hand (and the dealer’s up card).  For example, basic strategy is to always hit hard 16 against a dealer 10.  But we can improve performance by allowing playing strategy to vary based not only on the cards in the current hand, but also on the current true count.

Next time, I will describe how this is done, including new software for evaluating the resulting improvement in expected return, compared with the best possible improvement from perfect play.
