# How GoTo Telescopes Work

The latest issue of The College Mathematics Journal contains a good article by Donald Teets, titled “Push-To Telescope Mathematics” (see references below), describing the alignment process used by computerized telescopes to determine their orientation, and thus how to point to selected stars.  This is a beautiful problem, in part because of just how little information is needed for its solution.  Just power up the telescope, point it at any two known stars, and after some simple computation, the telescope knows how to find any other star in its database.

But some care must be taken; I think the solution in the article overlooks an important practical consideration, which I’ll get to shortly.  Fortunately, there is an equally beautiful solution to a natural generalization of this problem, that I suspect is probably closer to what is actually implemented in the software of these computerized telescopes.

The basic problem is this: when the telescope is first powered up, it does not know its orientation relative to the stars.  That is, it does not know the relationship between its “body frame” azimuth and altitude (which it can sense and/or control with motors) and the corresponding “reference frame” right ascension and declination.

We can point the telescope at a known star $S_1$, and tell the telescope that it is pointing at that star.  This gives some information about the telescope’s orientation: a particular direction in the body frame corresponds to a known direction in the reference frame.  From now on, no matter where the telescope slews, it always knows how to return to $S_1$. [1]

However, knowing the direction to just one star is not enough to completely determine orientation.  Intuitively, this makes sense: for example, when the telescope is pointed at $S_1$, it may still be “rolled” through any angle about its axis without changing its direction.  Slewing the telescope again to point at one more known star $S_2$ will do the trick.

(Actually, this problem is interesting in that one measurement is not enough, but two is too many.  Three parameters are needed to specify the telescope’s orientation– think yaw, pitch, and roll– and each star direction measurement provides two parameters.  So with two stars, the telescope’s orientation is over-determined.  We’ll come back to this shortly.)

At this point, we have the following mathematical problem: given 3-D unit (column) vectors $\mathbf{r}_i$ and $\mathbf{s}_i$ indicating the position of star $S_i$ in the inertial reference frame and sensor body frame, respectively, what is the 3×3 rotation matrix $A$ such that $A \mathbf{s}_i = \mathbf{r}_i$?

The CMJ article solves the problem as follows: let $R$ be the 3×3 matrix with columns given by

$R = (\mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_1 \times \mathbf{r}_2)$

and $S$ be the 3×3 matrix

$S = (\mathbf{s}_1, \mathbf{s}_2, \mathbf{s}_1 \times \mathbf{s}_2)$

Then $A S = R$; solving, we have $A = R S^{-1}$.  Simple, right?

This works fine… as long as our measurements of direction in the body and/or reference frames are exact.  If there are any errors– and there almost certainly are– then in general the matrix $A$ may not even represent a proper rotation (i.e., it may not be orthogonal with determinant 1).

Fortunately, there are two well-documented incremental improvements to this approach.  The first is the TRIAD algorithm, whose only difference is essentially to ensure that the input matrices are themselves rotations, even if the measurements are inaccurate:

$R = (\mathbf{r}_1, \frac{\mathbf{r}_1 \times \mathbf{r}_2}{|\mathbf{r}_1 \times \mathbf{r}_2|}, \mathbf{r}_1 \times (\frac{\mathbf{r}_1 \times \mathbf{r}_2}{|\mathbf{r}_1 \times \mathbf{r}_2|}))$

$S = (\mathbf{s}_1, \frac{\mathbf{s}_1 \times \mathbf{s}_2}{|\mathbf{s}_1 \times \mathbf{s}_2|}, \mathbf{s}_1 \times (\frac{\mathbf{s}_1 \times \mathbf{s}_2}{|\mathbf{s}_1 \times \mathbf{s}_2|}))$

$A = R S^T$

Note that the TRIAD algorithm effectively treats the first measurement as “perfect” (i.e., $A \mathbf{s}_1 = \mathbf{r}_1$ exactly), and accounts for error in the second measurement.

A more realistic approach would be to assume that all of the measurements contain errors, and to try to determine the “best” rotation that minimizes the effect of those errors in a least-squares sense.  It turns out that we can do just that, with a very elegant algorithm described by Markley (see references below) that looks only slightly more sophisticated than what we have already seen.  As a bonus, this algorithm can accept more than just two measurements:

Given n measurements of unit vector pairs in the reference and body frames, let $R$ be the 3xn matrix with columns corresponding to the reference frame vectors, and $S$ be the 3xn matrix of body frame vectors.  Then the rotation matrix $A$ is computed using the singular value decomposition of $R S^T$:

$U \Sigma V^T = R S^T$

$A = U diag(1, 1, \det(U) \det(V)) V^T$

Finally, it is worth emphasizing that this is really just one of several algorithms that solve the same least-squares problem.  Shuster provides a good history and survey of these algorithms in the paper referenced below, including his quaternion-based QUEST algorithm.

References:

1. Markley, F. L., Attitude Determination using Vector Observations and the Singular Value Decomposition, The Journal of the Astronautical Sciences, 36(3) (July-September 1988): 245-258. [PDF]
2. Shuster, M. D., The Quest for Better Attitudes, The Journal of the Astronautical Sciences, 54(3-4) (July-December 2006): 657–683. [PDF]
3. Teets, D., Push-To Telescope Mathematics. The College Mathematics Journal43(3) (May 2012): 227-231. [JSTOR]

[1] This is not quite true.  I am glossing over the fact that the telescope “body” frame is rotating with the Earth.  So after some time has passed, the telescope does not even have enough information to know how to rotate back to the star $S_1$.  But as will be shown, we can handle this problem, with a second star… and an onboard clock to measure the time difference between the two star measurements.

# The Hunger Games: Follow-up

This is a follow-up to the previous post, which presented a situation in the second book of the Hunger Games trilogy as a probability problem.  The problem may be re-stated as follows: when $n=59$ balls are randomly distributed into $d=24$ initially empty urns, what is the probability that every urn gets at least one ball?  (In the book, living past Hunger Games victors are the balls, and the districts/genders of the victors are the urns.  The problem asks how likely is it that a male and female from each of 12 districts is represented.)

I found that students really enjoyed tackling this problem, and it can lead in many interesting directions.  It makes for a very accessible programming exercise to estimate the probability by simulation.  But there is also a nice analytic solution, particularly when each urn is equally likely.  In that case, we can use inclusion-exclusion to count distributions of the balls leaving $k$ of the urns empty, yielding the probability:

$\sum_{k=0}^d (-1)^k {d \choose k} (\frac{d-k}{d})^n$

The following plot shows this probability as a function of $n$, the number of balls (i.e., past victors).  For $n=59$, the probability is only about 0.099, or less than 1 in 10.

Probability that all 12 districts have a living male and female past victor, vs. the number of living victors.

Finally, this problem is essentially a slight variant of the coupon collector problem, where in the latter we are asked for the expected number of balls needed so that every urn gets at least one ball.  There is plenty of interesting literature on the problem, including the case where the probabilities for each urn are not equal.

Which brings us back to the original episode in the book, where those probabilities are indeed not equal; 3 of the 12 districts are described as favorites for producing victors.  This makes for another nice simulation exercise, where it is necessary to quantify what “favorite” might mean.  Another interesting challenge is to show that this always makes the scenario even less likely.  That is, the probability of “hitting” all of the urns is maximized when the probabilities for the urns are equal.