A number concatenation problem

Introduction

Consider the following problem: given a finite set of positive integers, arrange them so that the concatenation of their decimal representations is as small as possible.  For example, given the numbers {1, 10, 12}, the arrangement (10, 1, 12) yields the minimum possible value 10112.

I saw a variant of this problem in a recent Reddit post, where it was presented as an “easy” programming challenge, referring in turn to a blog post by Santiago Valdarrama describing it as one of “Five programming problems every software engineer should be able to solve in 1 hour.”

I think the problem is interesting in that it seems simple and intuitive (and indeed it does have a solution with a relatively straightforward implementation), but there are also several “intuitive” approaches that don’t work, and even for the correct implementation, there is some subtlety involved in proving that it really works.

Brute force

First, the following Python 3 implementation simply tries all possible arrangements, returning the lexicographically smallest concatenation:

import itertools

def min_cat(num_strs):
    return min(''.join(s) for s in itertools.permutations(num_strs))

(Aside: for convenience in the following discussion, all inputs are assumed to be a list of strings of decimal representations of positive integers, rather than the integers themselves.  This lends some brevity to the code, without adding to or detracting from the complexity of the algorithms.)

This implementation works because every concatenated arrangement has the same length, so a lexicographic comparison is equivalent to comparing the corresponding numeric values.  It’s unacceptably inefficient, though, since we have to consider all n! possible arrangements of n inputs.

Sorting

We can do better, by sorting the inputs in non-decreasing order, and concatenating the result.  But this is where the problem gets tricky: what order relation should we use?

We can’t just use the natural ordering on the integers; using the same earlier example, the sorted arrangement (1, 10, 12) yields 11012, which is larger than the minimum 10112.  Similarly, the sorted arrangement (2, 11) yields 211, which is larger than the minimum 112.

We can’t use the natural lexicographic ordering on strings, either; the initial example (1, 10, 12) fails again here.
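The two failed orderings above are easy to check against brute force.  The following is a minimal sketch; the helper names brute_force_min, numeric_sort_cat, and lexicographic_sort_cat are hypothetical, introduced here only for illustration:

```python
import itertools

def brute_force_min(num_strs):
    # All concatenations have the same length, so the lexicographic
    # minimum is also the numeric minimum.
    return min(''.join(p) for p in itertools.permutations(num_strs))

def numeric_sort_cat(num_strs):
    # Sort by the natural ordering on the integers, then concatenate.
    return ''.join(sorted(num_strs, key=int))

def lexicographic_sort_cat(num_strs):
    # Sort by the natural lexicographic ordering on strings, then concatenate.
    return ''.join(sorted(num_strs))

for nums in (['1', '10', '12'], ['2', '11']):
    print(nums, brute_force_min(nums),
          numeric_sort_cat(nums), lexicographic_sort_cat(nums))
```

For (1, 10, 12), both sorted arrangements give 11012 rather than the minimum 10112.  For (2, 11), the lexicographic ordering happens to work, but the numeric ordering gives 211.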

The complexity arises because the numbers in a given input set may have different lengths, i.e. numbers of digits.  If all of the numbers were guaranteed to have the same number of digits, then the numeric and lexicographic orderings would be the same, and both would yield the correct solution.  Several users in the Reddit thread, and even Valdarrama, propose “padding” each input in various ways before sorting to address this, but this is also tricky to get right.  For example, how should the inputs {12, 121} be padded so that a natural lexicographic ordering yields the correct minimum value 12112?

There is a way to do this, which I’ll leave as an exercise for the reader.  Instead, consider the following solution (still Python 3):

import functools

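def cmp(x, y):
    # x and y are digit strings, so + is concatenation: the result is
    # negative if xy < yx, zero if xy == yx, and positive if xy > yx.
    return int(x + y) - int(y + x)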

def min_cat(num_strs):
    return ''.join(sorted(num_strs, key=functools.cmp_to_key(cmp)))

There are several interesting things going on here.  First, a Python-specific wrinkle: we need to specify the order relation \prec by which to sort.  This actually would have looked slightly simpler in the older Python 2.7, where you could specify a binary comparison function directly.  In Python 3, you can only provide a unary key function to apply to each element in the list, and sort by that.  It’s an interesting exercise in itself to work out how to “convert” a comparison function into the corresponding key function; here we lean on the built-in functools.cmp_to_key to do it for us.  (This idea of specifying an order relation by a natural comparison without a corresponding natural key has been discussed here before, in the context of Reddit’s comment ranking algorithm.)

Second, recall that the input num_strs is a list of strings, not integers, so in the implementation of the comparison cmp(x, y), the arguments are strings, and the addition operators are concatenation.  The comparison function returns a negative value if the concatenation xy, interpreted as an integer, is less than yx, zero if they are equal, or a positive value if xy is greater than yx.  The intended effect is to sort according to the relation x \prec y defined as xy < yx.

It works… but should it?

This implementation has a nice intuitive justification: suppose that the entire input list contained just two strings x and y.  Then the comparison function effectively realizes the “brute force” evaluation of the two possible arrangements xy and yx.

However, that same intuitive reasoning becomes dangerous as soon as we consider input lists with more than two elements.  That comparison function should bother us, for several reasons:

First, it’s not obvious that the resulting sorted ordering is even well-defined.  That is, is the order relation \prec a strict weak ordering of the set of (decimal string representations of) positive integers?  It certainly isn’t a total ordering, since distinct values can compare as “equal”: for example, consider (1, 11), or (123, 123123), etc.

Second, even assuming the comparison function does realize a strict weak ordering (we’ll prove this shortly), that ordering has some interesting properties.  For example, unlike the natural ordering on the positive integers, there is no smallest element.  That is, for any x, we can always find another strictly lesser y \prec x (as a simple example, note that x0 \prec x, e.g., 1230 \prec 123).  Also unlike the natural ordering on the positive integers, this ordering is dense; given any pair x \prec y, we can always find a third value z in between, i.e., x \prec z \prec y.

Finally, and perhaps most disturbingly, observe that a swap-based sorting algorithm will not necessarily make “monotonic” progress toward the solution: swapping elements that are “out of order” in terms of the comparison function may not always improve the overall situation.  For example, consider the partially-sorted list (12, 345, 1), whose concatenation yields 123451.  The comparison function indicates that 12 and 1 are “out of order” (121>112), but swapping them makes things worse: the concatenation of (1, 345, 12) yields the larger value 134512.
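The example above can be checked directly.  This is a small sketch that repeats the cmp function from the post and adds a hypothetical helper, value, to evaluate an arrangement; it also shows that swapping an adjacent out-of-order pair in the same list does help:

```python
def cmp(x, y):
    # Comparison from the post: sign of int(xy) - int(yx) for digit strings.
    return int(x + y) - int(y + x)

def value(arrangement):
    # Numeric value of the concatenated arrangement (hypothetical helper).
    return int(''.join(arrangement))

start = ['12', '345', '1']

# 12 and 1 are "out of order" (121 > 112), but they are not adjacent,
# and swapping them makes the concatenation larger.
assert cmp('12', '1') > 0
non_adjacent_swap = ['1', '345', '12']
print(value(start), value(non_adjacent_swap))

# 345 and 1 are also out of order, and they are adjacent;
# swapping them makes the concatenation smaller.
assert cmp('345', '1') > 0
adjacent_swap = ['12', '1', '345']
print(value(start), value(adjacent_swap))
```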

Proof of correctness

Given all of this perhaps non-intuitive weirdness, it seems worth being more rigorous in proving that the above implementation actually does work.  We do this in two steps:

Theorem 1: The relation \prec defined by the comparison function cmp is a strict weak ordering.

Proof: Irreflexivity follows from the definition.  To show transitivity, let x, y, z be positive integers with a, b, c digits, respectively, with x \prec y and y \prec z.  Then

10^b x+y < 10^a y+x and 10^c y+z < 10^b z+y

Rearranging these gives x(10^b-1) < y(10^a-1) and y(10^c-1) < z(10^b-1).  Thus,

x(10^c-1) < x \frac{z}{y}(10^b-1) < z(10^a-1)

10^c x-x < 10^a z-z

10^c x+z < 10^a z+x

i.e., x \prec z.  Incomparability of x and y corresponds to xy=yx; this is an equivalence relation, with reflexivity and symmetry following from the definition, and transitivity shown exactly as above (with equality in place of inequality).
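One way to read the proof: the first inequality rearranges to x(10^b-1) < y(10^a-1), i.e., x/(10^a-1) < y/(10^b-1), so x \prec y compares the values of the repeating decimals 0.xxx… and 0.yyy….  This suggests an exact key function as an alternative to the comparison function.  The following is only a sketch under that observation; repeating_key and min_cat_by_key are hypothetical names, not part of the implementation above:

```python
from fractions import Fraction

def cmp(x, y):
    # Comparison from the post, repeated here for reference.
    return int(x + y) - int(y + x)

def repeating_key(s):
    # Exact value of the repeating decimal 0.sss..., namely x / (10^a - 1),
    # where x = int(s) and a is the number of digits of s.
    return Fraction(int(s), 10 ** len(s) - 1)

def min_cat_by_key(num_strs):
    # Sort by the rational key instead of by the comparison function.
    return ''.join(sorted(num_strs, key=repeating_key))

print(min_cat_by_key(['1', '10', '12']))
print(min_cat_by_key(['12', '121']))
```

Values that compare as “equal” under \prec, such as 1 and 11, get the same key (1/9 = 11/99).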

Theorem 2: Concatenating positive integers sorted by \prec yields the minimum value among all possible arrangements.

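Proof: Let x_1 x_2 \ldots x_n be the concatenation of an arrangement of positive integers with minimum value, and suppose that it is not ordered by \prec, i.e., x_i \succ x_{i+1} for some 1 \leq i < n.  Then the concatenation x_1 x_2 \ldots x_{i+1} x_i \ldots x_n is strictly smaller, a contradiction.  So every minimum arrangement is ordered by \prec.  Conversely, any two arrangements ordered by \prec differ only in the order of incomparable elements, for which xy=yx, so all such arrangements yield the same concatenation, which is therefore the minimum.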

(Note that this argument is only “simple” because x_i and x_{i+1} are adjacent.  As mentioned above, swapping non-adjacent elements that are out of order may not in general decrease the overall value.)
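As a sanity check on Theorem 2 (not a substitute for the proof), the sorted solution can be compared against brute force on random inputs.  This is a minimal sketch; the function name brute_force_min, the value range, the list sizes, and the random seed are all arbitrary choices made here for illustration:

```python
import functools
import itertools
import random

def cmp(x, y):
    return int(x + y) - int(y + x)

def min_cat(num_strs):
    # Sorted solution from the post.
    return ''.join(sorted(num_strs, key=functools.cmp_to_key(cmp)))

def brute_force_min(num_strs):
    # Brute-force solution from the post, renamed to avoid a name clash.
    return min(''.join(p) for p in itertools.permutations(num_strs))

random.seed(0)  # arbitrary seed, for repeatability
for _ in range(1000):
    nums = [str(random.randint(1, 999))
            for _ in range(random.randint(1, 6))]
    assert min_cat(nums) == brute_force_min(nums), nums
print('all random cases agree')
```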

Cutting crown molding

This post captures my notes on how to determine the miter and bevel angles for cutting crown molding with a compound miter saw.  There are plenty of web sites with tables of these angles, and even formulas for calculating them, but I thought it would be useful to be more explicit about sign conventions, orientation of the finished cut pieces, etc., as well as to provide a slightly different version of the formulas that doesn’t involve a discontinuity right in the middle of the region of interest, as typically seems to be the case.

Instructions

Let s be the measure of the spring angle, i.e., the angle made by the flat back side of the crown molding with the wall (typically 38 or 45 degrees).  Let w be the measure of the wall angle (e.g., 90 degrees for an inside corner, 270 degrees for an outside corner, etc.).

To cut the piece on the left-hand wall (facing the corner), set the bevel angle b and miter angle m to

b = \arcsin(-\cos\frac{w}{2}\cos s)

m = \arcsin(-\tan b \tan s)

where positive angles are to the right (i.e., positive miter angle is counter-clockwise).  Cut with the ceiling contact edge against the fence, and the finished piece on the left side of the blade.

To cut the piece on the right-hand wall (facing the corner), reverse the miter angle,

m' = -m = \arcsin(\tan b \tan s)

and cut with the wall contact edge against the fence, and the finished piece still on the left side of the blade.
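For convenience, the formulas above can be evaluated with a few lines of Python.  This is a sketch only: the function name crown_angles and the choice of degrees for input and output are assumptions made here, and the signs follow the conventions stated above.

```python
import math

def crown_angles(spring_deg, wall_deg):
    """Return (bevel, miter) in degrees for the left-hand piece.

    spring_deg is the spring angle s and wall_deg is the wall angle w.
    For the right-hand piece, use the same bevel and the negated miter.
    """
    s = math.radians(spring_deg)
    w = math.radians(wall_deg)
    b = math.asin(-math.cos(w / 2) * math.cos(s))
    m = math.asin(-math.tan(b) * math.tan(s))
    return math.degrees(b), math.degrees(m)

for spring, wall in ((38, 90), (45, 90), (38, 270)):
    bevel, miter = crown_angles(spring, wall)
    print(f's={spring} w={wall}: bevel={bevel:.2f} miter={miter:.2f}')
```

For a 38-degree spring angle and an inside 90-degree corner, this gives a bevel of roughly -33.9 degrees and a miter of roughly 31.6 degrees.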

Derivation

Let’s start by focusing on the crown molding piece on the left-hand wall as we face the corner.  Consider a coordinate frame with the ceiling corner at the origin, the positive x-axis running along the crown molding to be cut, the negative z-axis running down to the floor, and the y-axis completing the right-handed frame, as shown in the figure below.  In this example of an inside 90-degree corner, the positive y-axis runs along the opposite wall.

[Figure: Cutting crown molding for left-hand wall. Example shows an inside corner (w=90 degrees).]

The desired axis of rotation of the saw blade is normal to the triangular cross section at the corner, which may be computed as the cross product of unit vectors from the origin to the vertices of this cross section:

\mathbf{u} = (0, 0, -1) \times (\cos\frac{w}{2}, \sin\frac{w}{2}, 0)

To cut with the back of the crown molding flat on the saw table (the xz-plane), with the ceiling contact edge against the fence (the xy-plane), rotate this vector by angle s about the x-axis:

\mathbf{v} = \left(\begin{array}{ccc}1&0&0\\0&\cos s&-\sin s\\0&\sin s&\cos s\end{array}\right) \mathbf{u}

It remains to compute the bevel and miter rotations that transform the axis of rotation of the saw blade from its initial direction (1,0,0) to \mathbf{v}.  With the finished piece on the left side of the blade, the bevel is a rotation by angle b about the z-axis, followed by the miter rotation by angle m about the y-axis:

\left(\begin{array}{ccc}\cos m&0&\sin m\\0&1&0\\-\sin m&0&\cos m\end{array}\right) \left(\begin{array}{ccc}\cos b&-\sin b&0\\ \sin b&\cos b&0\\0&0&1\end{array}\right) \left(\begin{array}{c}1\\0\\0\end{array}\right) = \mathbf{v}

Solving yields the bevel and miter angles above.  For the crown molding piece on the right-hand wall, we can simply change the sign of both s and w, assuming that the wall contact edge is against the fence (still with the finished piece on the left side of the blade).  The result is no change to the bevel angle, and a sign change in the miter angle.
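Written out by components, the matrix equation above reads (\cos m \cos b, \sin b, -\sin m \cos b) = (\sin\frac{w}{2}, -\cos\frac{w}{2}\cos s, -\cos\frac{w}{2}\sin s); the second component gives b, and the third then gives m.  The following sketch checks this numerically for the left-hand piece, assuming a wall angle strictly between 0 and 360 degrees; the function names are hypothetical:

```python
import math

def cross(a, b):
    # Cross product of two 3-vectors.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def target_axis(s, w):
    # u = (0, 0, -1) x (cos(w/2), sin(w/2), 0), rotated by s about the x-axis.
    u = cross((0.0, 0.0, -1.0), (math.cos(w / 2), math.sin(w / 2), 0.0))
    return (u[0],
            math.cos(s) * u[1] - math.sin(s) * u[2],
            math.sin(s) * u[1] + math.cos(s) * u[2])

def blade_axis(b, m):
    # Bevel by b about the z-axis applied to (1, 0, 0), then miter by m
    # about the y-axis.
    x, y, z = math.cos(b), math.sin(b), 0.0
    return (math.cos(m) * x + math.sin(m) * z,
            y,
            -math.sin(m) * x + math.cos(m) * z)

def solve_angles(s, w):
    # Bevel and miter formulas from the Instructions section (radians).
    b = math.asin(-math.cos(w / 2) * math.cos(s))
    m = math.asin(-math.tan(b) * math.tan(s))
    return b, m

for s_deg in (38, 45):
    for w_deg in (90, 135, 270):
        s, w = math.radians(s_deg), math.radians(w_deg)
        v = target_axis(s, w)
        r = blade_axis(*solve_angles(s, w))
        assert all(abs(vi - ri) < 1e-12 for vi, ri in zip(v, r))
print('blade axis matches v for all test cases')
```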