MATLAB’s colon operator and for loops

Introduction

The MATLAB colon operator is surprisingly complicated, given that its job seems pretty simple to describe: generate a vector of regularly-spaced values, with a specified starting point, step size, and endpoint. For example, to create the vector x=(0, 0.1, 0.2, \ldots, 1.1, 1.2):

x = 0:0.1:1.2;

At least some complexity is understandable, since as in this example, the “intended” step size and/or the endpoints may not be represented exactly in double floating-point precision. But in MATLAB’s usual habit of trying to “helpfully” account for this, things get messier than they need to be. The motivation for this post is to describe two different behaviors of the colon operator: it behaves in one special way in for loops, and in a different way– well, everywhere else.

Creating vectors with colon syntax

First, the “everywhere else” case: as the documentation suggests,

The vector elements are roughly equal to [start, start+step, start+2*step, ...] … however, if [the step size] is not an integer, then floating point arithmetic plays a role in determining whether colon includes the endpoint in the vector.

That is, continuing the above example, note that ismember(1.2, x), despite the fact that 0+12*0.1 > 1.2. But the actual implementation is even more complex than just computing the “intended” endpoint. The output vector is effectively constructed in two halves, adding multiples of the step size to the starting point in the first half, and subtracting multiples of the step size from the (computed) endpoint in the second half.

So far, this seems reasonably well known, despite the broad strokes documentation. There is a good description of the details of how this works on Stack Overflow. Let’s not worry about those details here, though; instead, what seems less well known is that the same colon expression, such as in the example above, behaves differently when it appears in a for loop.

For loops with (and without) colon syntax

First, it’s worth noting that MATLAB for loops don’t have to use the colon operator at all. With not-quite-full-fledged iterator-ish semantics, you can iterate over the columns of an arbitrary array expression. For some examples:

for index = [2, 3, 5, 7]
    disp(index); % 4 iterations
end

for index = x
    disp(index); % 13 iterations
end

(Technically, iteration is over first-column “slices” of the possibly multi-dimensional array. This can cause some non-intuitive behavior. For example, how many iterations would you expect over ones(2,0,3)? What about ones(0,2,3)?)

But here is where things get weird. Consider the following example:

x = 0:0.1:1.2;
for index = 0:0.1:1.2
    disp(find(x == index));
end

This loop only “finds” 7 of the 13 elements of the original vector above, which was created using exactly the same colon operator expression!

So what’s going on? First, while the colon operator documentation was perhaps merely incomplete, the for loop documentation is downright misleading, suggesting that the behavior is to “increment index by the step on each iteration.” That sounds to me like repeatedly adding the step size to the value at the previous iteration, which would be even worse in terms of error accumulation, and is fortunately not what’s happening here.

Instead, experiments suggest that what is happening is essentially the overly-simplified description in the colon operator (not the for loop) documentation: the statement for index = start:step:stop iterates over values of the form start+k*step — i.e., adding multiples of the step size to the starting point– with the added detail that the number of iterations (i.e., the stopping point) seems to be computed in the same way as the “normal” colon operator. That is, the documentation is also wrong in that it’s not as simple as incrementing “until index is greater than stop” (witness the example above, where the last value is allowed to slightly overshoot the given endpoint). I have been unable to find an example of a colon expression whose size is different depending on whether it’s in a for loop.

Conclusion

What I find most interesting about this is how hard MathWorks has to work– and is still working— to make this confusing. That is, the colon syntax in a for statement is a special case in the parser: there are necessarily extra lines of code to (1) detect the colon syntax in a for loop, and (2) do something different than they could have done by simply always evaluating whatever arbitrary array expression– colon or otherwise– is given to the right of the equals sign.

And this isn’t just old legacy behavior that no one is paying attention to anymore. Prior to R2019b, you could “trick” the parser into skipping the special case behavior in a for loop by wrapping the colon expression in redundant array brackets:

for index = [0:0.1:1.2]
    disp(find(x == index)); % finds all 13 values
end

However, as of R2019b, this no longer “works;” short of using the explicit function notation colon(0,0.1,1.2), it now takes more sophisticated obfuscation on the order of [0:0.1:1.2, []] or similar nonsense to say, “No, really, use the colon version, not the for loop version.”

This entry was posted in Uncategorized. Bookmark the permalink.

1 Response to MATLAB’s colon operator and for loops

  1. codesections says:

    Another thing that strikes me from this example is how much harm we do/extra complexity we inflict on ourselves by having chosen floating-point numbers as the default non-integer numerical representation. I’ve been writing more Raku lately, and the equivalent to `x = 0:0.1:1.2;` would be `my $x = 0, .1 … 1.2;`. In Raku, this Just Works™ – not because there’s any any special-casing of syntax in the compiler, but because the range returns one `Int` followed by a `Rat`s (Raku’s rational number type).

    I’m certainly not suggesting that we shouldn’t use floats – in many cases, they really are the best tool for the job. But I am starting to wonder if they’ve become too pervasive and if there aren’t some areas where true rational numbers would be a simpler abstraction (without unacceptable performance trade offs).

    In any event, thanks for the thought-provoking post!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.