## Array resizing in MATLAB

I encountered the following MATLAB code recently, simplified for this discussion; it builds a 3-by-4-by-2 array by assigning each of its three 4-by-2 “block” sub-arrays in turn to an initially empty array:

```sz = [3, 4, 2];
block = reshape(0:(prod(sz(2:end)) - 1), sz(2:end));
a = [];
for k = 1:sz(1)
a(k,:,:) = block; block = block + numel(block);
end
```

As this example shows, MATLAB allows resizing arrays on the fly, so to speak; assignment to an element– or in this case, a range of elements– whose indices would normally be out of bounds automatically re-allocates memory to accommodate the new, larger array.  However, these incremental re-allocation and re-copying operations can be costly in terms of execution time.  Granted, this toy example is small enough not to matter, but in the “real” example, the eventual array size was approximately `sz=[256,16384,4]`, in which case executing the above code takes about 8 seconds on my laptop.

Pre-allocation is usually recommended as a fix: that is, size the entire array in advance, before assigning any elements.  But what if you don’t know the size of the array in advance?  Although there are several approaches to dealing with this situation, with varying complexity, the subject of this post is to describe just how much execution time may be eliminated by only a modest change to the above code.

A major source of slowness is that the array expansion occurs along the first dimension… which in MATLAB, with its Fortran roots, is the index that changes most frequently in the underlying block of memory containing the array elements.  So not only must we re-allocate memory to accommodate each new sub-array, even copying the existing array elements is a non-trivial task, as the following figure shows.  As we “append” first the cyan block, then the red block, then the green block, each incremental re-allocation also requires “re-interleaving” the old and new elements in the new larger block of memory:

Expanding an eventual 3x4x2 array in MATLAB along the first dimension.

Instead, consider the following code, modified to append each block along the last— and most slowly changing– dimension, followed by the appropriate transposition via `permute` to shuffle everything back into the originally intended shape:

```a = [];
for k = 1:sz(1)
a(:,:,k) = block; block = block + numel(block);
end
a = permute(a, [3, 1, 2]);
```

The effect of this change is that, although we still incur all of the incremental re-allocation cost, each memory copy operation is a more straightforward and much faster “blit”:

Expanding an eventual 3x4x2 array in MATLAB along the last dimension, followed by transposing to yield the desired order of dimensions.

This version executes in less than a quarter of a second on my laptop, about 35 times faster than the original 8 seconds.

This entry was posted in Uncategorized. Bookmark the permalink.

### 4 Responses to Array resizing in MATLAB

1. Clever approach and great illustrations. The two images alone communicate 90% of the message.

2. Szymon says:

These concepts are also nicely described in the Accelerating Matlab Performance book by Yair Altman. You can do some good speedups with permute and reshape commands.

3. gtownescapee says:

Would a cell array of 2D matrices have been more efficient in your real application? The advantage of using a cell array is you wouldn’t have to grow a large contiguous block of memory, but you may sacrifice some memory access time if you need to slice or I index across cells.

• This is a good point; you’re right that appending to cell arrays are really fast (the only contiguous memory block that must be grown is the array of pointers to cell elements). However, in this case, that last permute() transposition is critical: although we are “building” the array in 16384×4 blocks at a time, once the array is created, we want to do fast arithmetic operations on the 256×16384 “slices” of it.

This site uses Akismet to reduce spam. Learn how your comment data is processed.