The MATLAB colon operator is surprisingly complicated, given that its job seems pretty simple to describe: generate a vector of regularly spaced values, with a specified starting point, step size, and endpoint. For example, to create the vector (0, 0.1, 0.2, …, 1.2):
x = 0:0.1:1.2;
At least some complexity is understandable, since, as in this example, the “intended” step size and/or the endpoints may not be exactly representable in double-precision floating point. But in MATLAB’s usual habit of trying to “helpfully” account for this, things get messier than they need to be. The motivation for this post is to describe two different behaviors of the colon operator: it behaves in one special way in for loops, and in a different way, well, everywhere else.
Creating vectors with colon syntax
First, the “everywhere else” case: as the documentation suggests,
“The vector elements are roughly equal to [start, start+step, start+2*step, ...] … however, if [the step size] is not an integer, then floating point arithmetic plays a role in determining whether colon includes the endpoint in the vector.”
That is, continuing the above example, note that ismember(1.2, x) is true, despite the fact that 0+12*0.1 > 1.2. But the actual implementation is even more complex than just computing the “intended” endpoint. The output vector is effectively constructed in two halves, adding multiples of the step size to the starting point in the first half, and subtracting multiples of the step size from the (computed) endpoint in the second half.
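The two-halves idea can be sketched roughly as follows. This is a simplified illustration and not MathWorks' implementation: it assumes the number of steps (n) and the final element (last) have already been determined, which is the hard part, and the variable names are made up for this example.

```matlab
% Simplified sketch of the "two halves" fill, NOT MathWorks' actual code.
% Assumption: the step count n and the final element (last) are already known.
start = 0;
step  = 0.1;
last  = 1.2;   % assumed computed endpoint
n     = 12;    % assumed number of steps, so the vector has n+1 elements

k = 0:n;                        % element indices, counted from zero
v = zeros(1, n + 1);
firstHalf = (k <= n/2);         % first half: count up from the start
v(firstHalf)  = start + k(firstHalf) * step;
v(~firstHalf) = last - (n - k(~firstHalf)) * step;  % second half: count down from the end

disp(v(end) == last);                   % the endpoint is hit exactly by construction
disp(max(abs(v - (start + k*step))));   % largest difference from naive start + k*step
```

Because the last element is computed as last minus zero steps, it equals the endpoint exactly, which is consistent with the ismember(1.2, x) observation above.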
So far, this seems reasonably well known, despite the broad strokes documentation. There is a good description of the details of how this works on Stack Overflow. Let’s not worry about those details here, though; instead, what seems less well known is that the same colon expression, such as in the example above, behaves differently when it appears in a for loop.
For loops with (and without) colon syntax
First, it’s worth noting that MATLAB for loops don’t have to use the colon operator at all. With not-quite-full-fledged iterator-ish semantics, you can iterate over the columns of an arbitrary array expression. For some examples:
for index = [2, 3, 5, 7]
    disp(index); % 4 iterations
end
for index = x
    disp(index); % 13 iterations
end
(Technically, iteration is over first-column “slices” of the possibly multi-dimensional array. This can cause some non-intuitive behavior. For example, how many iterations would you expect over ones(2,0,3)?)
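As a small illustration of iterating over columns, here is a loop over an ordinary 2-by-3 matrix. The names A, col, and count are chosen just for this example.

```matlab
% Iterating over a 2-by-3 matrix: the loop variable is a whole column.
A = [1 2 3; 4 5 6];
count = 0;
for col = A
    count = count + 1;
    disp(col.');   % show each 2-by-1 column as a row
end
disp(count);       % number of iterations equals the number of columns
```

Each pass assigns an entire column to the loop variable, so a row vector (as in the examples above) is just the special case where every column is a scalar.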
But here is where things get weird. Consider the following example:
x = 0:0.1:1.2;
for index = 0:0.1:1.2
    disp(find(x == index));
end
This loop only “finds” 7 of the 13 elements of the original vector above, which was created using exactly the same colon operator expression!
So what’s going on? First, while the colon operator documentation was perhaps merely incomplete, the for loop documentation is downright misleading, suggesting that the behavior is to “increment index by the step on each iteration.” That sounds to me like repeatedly adding the step size to the value at the previous iteration, which would be even worse in terms of error accumulation, and is fortunately not what’s happening here.
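To see why repeated addition would be worse, the following sketch compares it with multiplying the step by the iteration count. It is only an illustration of rounding error accumulation, with made-up variable names, and says nothing about what MATLAB does internally.

```matlab
% Illustration only: compare repeated addition with start + k*step.
start = 0;
step  = 0.1;
n     = 12;

acc = zeros(1, n + 1);        % values from repeatedly adding the step
acc(1) = start;
for k = 1:n
    acc(k + 1) = acc(k) + step;
end
mult = start + (0:n) * step;  % values from multiplying the step by k

disp(find(acc ~= mult) - 1);  % values of k where the two approaches disagree
disp(acc(11) == 1);           % false: ten additions of 0.1 fall just short of 1
disp(mult(11) == 1);          % true: 10*0.1 rounds to exactly 1
```

With repeated addition, every step carries forward the rounding error of all the previous ones, whereas start + k*step incurs only the rounding of a single multiplication and addition.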
Instead, experiments suggest that what is happening is essentially the overly simplified description in the colon operator (not the for loop) documentation: the statement for index = start:step:stop iterates over values of the form start+k*step (i.e., adding multiples of the step size to the starting point), with the added detail that the number of iterations (i.e., the stopping point) seems to be computed in the same way as for the “normal” colon operator. That is, the documentation is also wrong in that it’s not as simple as incrementing “until index is greater than stop” (witness the example above, where the last value is allowed to slightly overshoot the given endpoint). I have been unable to find an example of a colon expression whose size is different depending on whether it’s in a for loop.
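One way to run this kind of experiment yourself is sketched below. It only tests the start + k*step hypothesis for a single colon expression, so treat the result as evidence rather than a specification; the names allMatch and k are made up for this example.

```matlab
% Experiment, not a specification: test whether each loop value equals start + k*step.
start = 0;
step  = 0.1;
stop  = 1.2;

x = start:step:stop;       % the "normal" colon vector
k = 0;
allMatch = true;
for index = start:step:stop
    allMatch = allMatch && (index == start + k*step);
    k = k + 1;
end

disp(allMatch);            % true if the start + k*step hypothesis holds here
disp([k, numel(x)]);       % iteration count next to the vector length
```

Changing start, step, and stop is an easy way to look for counterexamples, either to the hypothesis itself or to the observation that the iteration count matches the vector length.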
What I find most interesting about this is how hard MathWorks has to work (and is still working) to make this confusing. That is, the colon syntax in a for statement is a special case in the parser: there are necessarily extra lines of code to (1) detect the colon syntax in a for loop, and (2) do something different than they could have done by simply always evaluating whatever arbitrary array expression (colon or otherwise) is given to the right of the equals sign.
And this isn’t just old legacy behavior that no one is paying attention to anymore. Prior to R2019b, you could “trick” the parser into skipping the special case behavior in a for loop by wrapping the colon expression in redundant array brackets:
for index = [0:0.1:1.2]
    disp(find(x == index)); % finds all 13 values
end
However, as of R2019b, this no longer “works”; short of using the explicit function notation colon(0,0.1,1.2), it now takes more sophisticated obfuscation on the order of [0:0.1:1.2, ] or similar nonsense to say, “No, really, use the colon version, not the for loop version.”
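For completeness, here is a sketch of two ways to make the loop see the same values as the vector. The first uses the explicit function notation and assumes it continues to bypass the special case as described above; the second simply iterates over a variable that already holds the vector, as in the earlier for index = x example. The counter names are made up for this example.

```matlab
% Two ways to make the loop see the same values as the vector x.
x = 0:0.1:1.2;

% (1) Explicit function notation (assumed to bypass the for loop special case).
foundFcn = 0;
for index = colon(0, 0.1, 1.2)
    foundFcn = foundFcn + any(x == index);
end

% (2) Iterate over a variable that already holds the vector.
foundVar = 0;
for index = x
    foundVar = foundVar + any(x == index);
end

disp([foundFcn, foundVar, numel(x)]);   % counts of matches next to the vector length
```

The second form does not depend on how any particular release parses a colon expression inside a for statement, since the vector is built before the loop starts.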