Random numbers

Experimental html version of downloadable textbook, see https://theartofhpc.com
\[ % mathjax inclusion. \newcommand\bbP{\mathbb{P}} \newcommand\bbR{\mathbb{R}} \newcommand\becomes{\mathop{:=}} \newcommand\dtdxx{\frac{\alpha\Delta t}{\Delta x^2}} \newcommand\defined{ \mathrel{\lower 5pt \hbox{${\equiv\atop\mathrm{\scriptstyle D}}$}}} \newcommand\fp[2]{#1\cdot10^{#2}} \newcommand\inv{^{-1}}\newcommand\invt{^{-t}} \newcommand\macro[1]{$\langle$#1$\rangle$} \newcommand\nobreak{} \newcommand\Rn{{\cal R}^n} \newcommand\Rnxn{{\cal R}^{n\times x}} \newcommand\sublocal{_{\mathrm\scriptstyle local}} \newcommand\toprule{\hline} \newcommand\midrule{\hline} \newcommand\bottomrule{\hline} \newcommand\multicolumn[3]{#3} \newcommand\defcolvector[2]{\begin{pmatrix} #1_0

#1_1


vdots

#1_{#2-1} \end{pmatrix}} % {
left(
begin{array}{c} #1_0

#1_1


vdots

#1_{#2-1}
end{array}
right) } \]

18.1 : Random Number Generation
18.1.1 : Sequential random number generators
18.2 : Random numbers in programming languages
18.2.1 : C
18.2.1.1 : Problems with the C random generator
18.2.2 : C++
18.2.2.1 : Random floats
18.2.2.2 : Dice throw
18.2.2.3 : Poisson distribution
18.2.3 : Fortran
18.2.4 : Python
18.3 : Parallel random number generation
18.3.1 : Manager-worker generator
18.3.2 : Sequence splitting solutions
18.3.3 : Blocked random number generators
18.3.4 : Keyed random number generators
18.3.5 : Mersenne twister
Back to Table of Contents

18 Random numbers

Random number generation is useful, for generating random test data, or in Monte Carlo simulation ; see chapter  app:montecarlo  .

Here we discuss random number generators in general, their use in programming languages, and the problems of parallel random number generationn.

18.1 Random Number Generation

crumb trail: > random > Random Number Generation

Random numbers are often used in simulations as some examples below will show. True random numbers are very hard to obtain: they could be generated by measuring quantum processes such as radioactive particles. Starting with the Intel Ivy Bridge  , Intel's processors have a hardware random number generator based on thermal noise  [Cryptography:IvyRandom2012]  .)

The most common solution is to use pseudo-random numbers \index{pseudo-random numbers|see{random numbers}}. This means that we use a deterministic mathematical process, that is sufficiently irregular that for practical purposes no order can be found in it.

18.1.1 Sequential random number generators

crumb trail: > random > Random Number Generation > Sequential random number generators

An easy way to generate random numbers (we leave off the `pseudo' qualification) is to use linear congruential generators (for all you ever need to know about random numbers, see Knuth  [Knuth:vol2]  ), recurrences of the form \begin{equation} x_{k+1} = (ax_k+b) \mod m. \end{equation} This sequence is periodic, since it consists of nonnegative integers at most $m-1$, and with period $m$ under certain conditions. A typical period is $2^{31}$. The starting point $x_0$ of the series is known as the `seed'. Software for random numbers often lets you specify the seed. To get reproducible results you would run your program with the same seed multiple times; to get random behavior over multiple runs of your program you could for instance derive the seed from clock and calendar functions.

Linear congruential generators may have some amount of correlation between lower bits. A different principle of generating random numbers is the lagged Fibonacci random number generator \begin{equation} X_i = X_{i-p}\otimes X_{i-q} \end{equation} where $p,q$ are the lag parameter, and $\otimes$ is any binary operation, such as addition or multiplication modulo $M$.

The main problems with lagged Fibonacci generators are:

18.2 Random numbers in programming languages

crumb trail: > random > Random numbers in programming languages

18.2.1 C

crumb trail: > random > Random numbers in programming languages > C

There is an easy (but not terribly great) random number generator that works the same in C and C++.

#include <random>
using std::rand;
float random_fraction =
    (float)rand()/(float)RAND_MAX;
The function rand yields an int

-- a different one every time you call it -- in the range from zero to RAND_MAX  . Using scaling and casting you can then produce a fraction between zero and one with the above code.

If you run your program twice, you will twice get the same sequence of random numbers. That is great for debugging your program but not if you were hoping to do some statistical analysis. Therefore you can set the random number seed from which the random sequence starts by the srand function. Example:

srand(time(NULL));
seeds the random number generator from the current time. This call should happen only once, typically somewhere high up in your main.

18.2.1.1 Problems with the C random generator

crumb trail: > random > Random numbers in programming languages > C > Problems with the C random generator

\includegraphics{rand7mod3}

FIGURE 18.1: Low number bias of a random number generator taken module.

18.2.2 C++

crumb trail: > random > Random numbers in programming languages > C++

The STL has a random number generator that is more general and more flexible than the C version.

First you declare an engine; later this will be transformed into a distribution:

std::default_random_engine generator;

This generator will start at the same value every time. You can seed it:

std::random_device r;
std::default_random_engine generator{ r() };

Next, you need to declare the distribution. For instance, a uniform distribution between given bounds:

std::uniform_real_distribution<float> distribution(0.,1.);
A roll of the dice would result from:
std::uniform_int_distribution<int> distribution(1,6);

18.2.2.1 Random floats

crumb trail: > random > Random numbers in programming languages > C++ > Random floats

// seed the generator
std::random_device r;
// std::seed_seq ssq{r()};
// and then passing it to the engine does the same

// set the default random number generator
std::default_random_engine generator{r()};

// distribution: real between 0 and 1
std::uniform_real_distribution<float> distribution(0.,1.);

cout << "first rand: " << distribution(generator) << endl;

18.2.2.2 Dice throw

crumb trail: > random > Random numbers in programming languages > C++ > Dice throw

// set the default generator
std::default_random_engine generator;

// distribution: ints 1..6
std::uniform_int_distribution<int> distribution(1,6);

// apply distribution to generator:
int dice_roll = distribution(generator);
  // generates number in the range 1..6 

18.2.2.3 Poisson distribution

crumb trail: > random > Random numbers in programming languages > C++ > Poisson distribution

Another distribution is the Poisson distribution :

std::default_random_engine generator;
float mean = 3.5;
std::poisson_distribution<int> distribution(mean);
int number = distribution(generator);

18.2.3 Fortran

crumb trail: > random > Random numbers in programming languages > Fortran

In this section we briefly discuss the Fortran random number generator  . The basic mechanism is through the library subroutine random_number  , which has a single argument of type

REAL with INTENT(OUT) :

real(4) :: randomfraction
call random_number(randomfraction)
The result is a random number from the uniform distribution on $\left[0,1\right)$.

Setting the random seed is slightly convoluted. The amount of storage needed to store the seed can be processor and implementation-dependent, so the routine random_seed can have three types of named argument, exactly one of which can be specified at any one time. The keyword can be:

A typical fragment for setting the seed would be:
integer :: seedsize
integer,dimension(:),allocatable :: seed

call random_seed(size=seedsize)
allocate(seed(seedsize))
seed(:) = ! your integer seed here
call random_seed(put=seed)

18.2.4 Python

crumb trail: > random > Random numbers in programming languages > Python

Python has a random module:

import random
x = random.random()
i = random.randint(lo,hi)

18.3 Parallel random number generation

crumb trail: > random > Parallel random number generation

Random number generation is problematic in parallel. To see this, consider a parallel process that uses a random number generator on each subprocess, and consider a single processor emulating the parallel process. Now this single process in effect has a random number generator that consists of interleaving the parallel generator results. This means that, if we use the same generator in all parallel processes, the effective generator over the whole process will produce stretches of identical values.

There are various ways out.

18.3.1 Manager-worker generator

crumb trail: > random > Parallel random number generation > Manager-worker generator

We can generate the random numbers centrally. In shared memory that could mean making its operation atomic. This may introduce a serious bottleneck.

Exercise Critical sections are usually justified if the work spent there is of lower order than the parallel work. Why does that argument not apply here.
End of exercise

Another solution would be to have one thread or process that generates the random numbers and distributes them to the other processes. Doing this on a number-by-number basis causes considerable overhead. Instead, it would be possible for the generator process to distribute blocks of numbers. However, the manner in which this is done may again cause correlation between processes.

18.3.2 Sequence splitting solutions

crumb trail: > random > Parallel random number generation > Sequence splitting solutions

A better solution is to set up independent generators with parameter choices that guarantee statistical randomness. This is not simple. For instance, if two sequences $x^{(1)}_i,x^{(2)}_i$ have the same values of $a,b,m$, and their starting points are close together, the sequences will be strongly correlated. Less trivial examples of correlation exist.

Various techniques for random number generation exist, such as using two sequences, where one generates the starting points for the other sequence, which is the one actually used in the simulation. Software for parallel random number generator can be found at

http://sprng.cs.fsu.edu/   [Mascagni:SPRNG]  .

If it is possible given $x_i$ to compute $x_{i+k}$ cheaply, one use a leapfrogging technique, where $k$ processes have disjoint series $i\mapsto x_{s_k+ik}$ where $x_{s_k}$ is the starting point for the $k$-th series.

18.3.3 Blocked random number generators

crumb trail: > random > Parallel random number generation > Blocked random number generators

Some random number generators (see  [LEcuyer:multiple-random]  ) allow you to calculate a value that is many iteration away from the seed. You could then take the block of values from the seed to that iteration and give it to one processor. Similarly, each processor would get a contiguous block of iterations of the generator.

18.3.4 Keyed random number generators

crumb trail: > random > Parallel random number generation > Keyed random number generators

Some random number generators are not recursive, but allow an explicit formulation \begin{equation} x_n = f(n), \end{equation} that is, the $n$-th number is a function of its `key', $n$. Adding a block key into the equation \begin{equation} x_n^{(k)} = f_k(n) \end{equation} allows for parallelism over processes indexed by $k$. See  [Salmon:prng123]  .

18.3.5 Mersenne twister

crumb trail: > random > Parallel random number generation > Mersenne twister

The Mersenne twister random number generator has been adapted to allow for parallel streams of uncorrelated numbers  [Matsumoto:DynamicMersenne]  . Here the process ID is encoded into the generator.

Back to Table of Contents