Last week a co-worker observed confusing behavior from the uniform random number generator in Simulink. The output of the generator “did not seem random.” This is what she saw:
This is certainly uniform, but I would agree that this does not appear to be “random.” (Note: there is a problem with the setup here, though, which we will get to shortly. Can you see it?) I wanted to understand this behavior better, and in the process I found some interesting things about the algorithm used in the generator.
I don’t use Simulink. But I do use MATLAB, which supports– and documents— several different algorithms for generating random numbers. This is why I was surprised to find that the corresponding documentation on Simulink’s uniform random number generator provides no detail, let alone configurable options, regarding the algorithm used. [Edit: Looking at this again, the documentation has now been updated to indicate the algorithm being used, described below.]
Fortunately, we can reverse-engineer the algorithm without too much detective work. First, the regularity observed in the figure above strongly suggests a linear congruential generator, and even more fortunately, MATLAB provides such a generator that, after some experimenting, exactly reproduces the behavior in Simulink. That generator is the ‘mcg16807’ or ‘v4’ legacy generator that was the default through MATLAB version 4.0. In other words, the latest version of Simulink today provides only a single generator… that MATLAB left behind over 15 years ago. Huh.
The ‘mcg16807’ generator is essentially the MINSTD generator, in which the state of the generator
where in this case
So far, none of this is new or interesting. The interesting wrinkle is in how the generator is seeded. In all implementations I have seen, the seed
How can we tell? Because if we were to simply set the generator’s initial state equal to the seed, as is usually the case, then the figure above would have looked even worse:
These 100 “random” values simply increase linearly in really small steps of
Getting back to the problem of determining the more complex seeding algorithm in the Simulink generator: we have a “gray box” random number generator that, when seeded with a value
By solving this congruence for a few appropriately selected seeds, we get a collection of
Following is MATLAB code that reproduces the behavior of the Simulink generator:
classdef MCG16807 < handle properties a = 7^5; % 16807 m = 2^31 - 1; state; end methods function self = MCG16807(s) self.seed(s); end function  = seed(self, s) switch s case 0 self.state = 1144108930; case 2^32 - 1 self.state = self.m - 1; otherwise self.state = min(rem(s * 2^16 + floor(s / 2^16) + ... bitand(s, 2^15), 2^31), self.m - 1); end end function r = rand(self) self.state = rem(self.a * self.state, self.m); r = self.state / self.m; end end end
I suppose swapping word halves like this is understandable, since it is a simple way to mitigate somewhat the undesirable effect above, where often-used “nearby” seeds yield very “nearby” sub-sequences of random draws. But in an attempt to be helpful, I think it ends up merely hiding the still-present potential danger inherent in this mis-use of a sub-standard generator.
The “twist,” though, is that weird extra addition of bitand(s, 2^15). That is, to compute the state from the seed, we do not just swap word halves, we also add (arithmetically, not bit-wise) the masked 15th bit of the seed. But why?
The reasoning behind this oddly-specific-yet-seemingly-arbitrary extra step is not clear to me. My guess, though– which is admittedly cynically biased– is that it has the feel of a rather half-hearted attempt to further “mix” the transformation from seed to state… in a way that rarely has a useful effect.