Monte Carlo Methods

11.1 : Motivation
11.1.1 : What is the attraction?
11.2 : Examples
11.2.1 : Monte Carlo simulation of the Ising model

11 Monte Carlo Methods

Monte Carlo simulation is a broad term for methods that use random numbers and statistical sampling to solve problems, rather than exact modeling. By the nature of this sampling, the result will have some uncertainty, but the statistical `law of large numbers' ensures that the uncertainty goes down as the number of samples grows.

An important tool for statistical sampling is a random number generator; see appendix app:random for a discussion of random number generation.

11.1 Motivation


Let's start with a simple example: measuring an area. For instance, $\pi$ is the area of a circle inscribed in a square with side 2. If you pick a random point in the square, the chance of it falling in the circle is $\pi/4$, so you can estimate this ratio by taking many random points $(x,y)$ and seeing for what proportion of them the distance $\sqrt{x^2+y^2}$ to the center is less than 1; multiplying that proportion by 4 then gives an estimate of $\pi$.

You could even do this as a physical experiment: suppose you have a pond of an irregular shape in your backyard, and that the yard itself is rectangular with known dimensions. If you now throw pebbles into your yard so that they are equally likely to land at any given spot, then the ratio of pebbles falling in the pond to those falling outside equals the ratio of the areas.
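Returning to the circle example, a minimal sketch of this estimate in C could look as follows; the sample count, the seed, and the use of the library rand() generator are arbitrary choices, not prescribed by the method.

\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>

/* Estimate pi by sampling points in the square [-1,1]^2 and
   counting how many fall inside the inscribed unit circle. */
int main(void) {
  long N = 1000000, inside = 0;
  srand(12345);                           /* arbitrary seed */
  for (long i = 0; i < N; i++) {
    double x = 2.0*rand()/RAND_MAX - 1.0;
    double y = 2.0*rand()/RAND_MAX - 1.0;
    if (x*x + y*y < 1.0)                  /* distance to the center < 1 */
      inside++;
  }
  printf("pi is approximately %f\n", 4.0*inside/N);
  return 0;
}
\end{verbatim}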

Less fancifully and more mathematically, we need to formalize the idea of falling inside or outside the shape you are measuring. Therefore, let $\Omega\subset[0,1]^2$ be the shape, and let a function $f(\bar x)$ describe the boundary of $\Omega$, that is \begin{equation} \begin{cases} f(\bar x)<0&\bar x\not\in\Omega\\ f(\bar x)>0&\bar x\in\Omega\\ \end{cases} \end{equation} Now take random points $\bar x_0,\bar x_1,\bar x_2,\ldots\in[0,1]^2$; then we can estimate the area of $\Omega$ as the fraction of these points for which $f(\bar x_i)$ is positive.
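The same counting scheme can be written as a C function that takes the boundary function as an argument; this is only a sketch, with the function-pointer interface and the use of rand() as illustrative assumptions.

\begin{verbatim}
#include <stdlib.h>

/* Estimate the area of a shape Omega inside the unit square [0,1]^2,
   given a boundary function f that is positive inside Omega and
   negative outside; N is the number of random sample points. */
double estimate_area(double (*f)(double,double), long N) {
  long inside = 0;
  for (long i = 0; i < N; i++) {
    double x = (double)rand()/RAND_MAX;
    double y = (double)rand()/RAND_MAX;
    if (f(x,y) > 0) inside++;             /* point falls inside Omega */
  }
  return (double)inside/N;                /* the unit square has area 1 */
}
\end{verbatim}

For instance, passing the function $f(x,y)=1/4-(x-1/2)^2-(y-1/2)^2$, which is positive inside the circle of radius $1/2$ around $(1/2,1/2)$, gives an estimate of that circle's area $\pi/4$.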

We can extend this idea to integration. The average value of a function on an interval $(a,b)$ is defined as \begin{equation} \langle f\rangle = \frac1{b-a}\int_a^bf(x)dx \end{equation} On the other hand, we can also estimate the average as \begin{equation} \langle f\rangle \approx \frac 1N\sum_{i=1}^Nf(x_i) \end{equation} if the points $x_i$ are reasonably distributed and the function $f$ is not too wild. This leads us to \begin{equation} \int_a^bf(x)dx \approx (b-a) \frac 1N\sum_{i=1}^Nf(x_i) \end{equation} Statistical theory, which we will not go into, tells us that the uncertainty $\sigma_I$ in the integral is related to the standard deviation $\sigma_f$ of the function values by \begin{equation} \sigma_I\sim \frac1{\sqrt N}\sigma_f \end{equation} for normal distributions.
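The following sketch, again in C and with an arbitrary example integrand and sample count, applies this formula; it also computes the sample standard deviation $\sigma_f$ so that the error estimate $\sigma_f/\sqrt N$ can be reported.

\begin{verbatim}
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Example integrand; any function on (a,b) could be substituted. */
double f(double x) { return exp(-x*x); }

int main(void) {
  double a = 0.0, b = 1.0;
  long N = 100000;
  double sum = 0.0, sumsq = 0.0;
  srand(12345);
  for (long i = 0; i < N; i++) {
    double x = a + (b-a)*rand()/RAND_MAX;  /* uniform point in (a,b) */
    double fx = f(x);
    sum += fx; sumsq += fx*fx;
  }
  double mean    = sum/N;                       /* estimate of <f>     */
  double sigma_f = sqrt(sumsq/N - mean*mean);   /* std deviation of f  */
  printf("integral approx %f, estimated error %f\n",
         (b-a)*mean, (b-a)*sigma_f/sqrt((double)N));
  return 0;
}
\end{verbatim}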

11.1.1 What is the attraction?


So far, Monte Carlo integration does not look much different from classical integration by Riemann sums. The difference appears when we go to higher dimensions. In that case, classical integration would need $N$ points in each dimension, leading to $N^d$ points in $d$ dimensions. In the Monte Carlo method, on the other hand, the points are taken at random from the $d$-dimensional space, and a much lower number of points suffices.

Computationally, Monte Carlo methods are attractive since all function evaluations can be performed in parallel.

The statistical law that underlies this is as follows: if $N$ independent observations are made of a quantity with standard deviation $\sigma$, then the standard deviation of the mean is $\sigma/\sqrt N$. This means that more observations will lead to more accuracy; what makes Monte Carlo methods interesting is that this gain in accuracy is not related to the dimensionality of the original problem.

Monte Carlo techniques are of course natural candidates for simulating phenomena that are statistical in nature, such as radioactive decay or Brownian motion. Other problems where Monte Carlo simulation is attractive lie outside the realm of scientific computing. For instance, the Black-Scholes model for stock option pricing [BlackScholes] uses Monte Carlo simulation.

Some problems that you have seen before, such as solving a linear system of equations, can be tackled with Monte Carlo techniques. However, this is not a typical application. Below we will discuss two applications where exact methods would take far too much time and where statistical sampling can quickly give a reasonably accurate answer.

11.2 Examples


11.2.1 Monte Carlo simulation of the Ising model


The Ising model (for an introduction, see [Cipra:Ising]) was originally proposed to model ferromagnetism. Magnetism is the result of atoms aligning their `spin' direction: let's say spin can only be `up' or `down'; then a material has magnetism if more atoms have spin up than down, or the other way around. The atoms are arranged in a structure called a `lattice'.

Now imagine heating up a material, which loosens up the atoms. If an external field is applied to the material, the atoms will start aligning with the field, and if the field is removed the magnetism disappears again. However, below a certain critical temperature the material will retain its magnetism. We will use Monte Carlo simulation to find the stable configurations that remain.

Let's say the lattice $\Lambda$ has $N$ atoms, and we denote a configuration of atoms as $\sigma=(\sigma_1,\ldots,\sigma_N)$ where each $\sigma_i=\pm1$. The energy of a lattice is modeled as \begin{equation} H=H(\sigma)=-J\sum_i\sigma_i-E\sum_{ij}\sigma_i\sigma_j. \end{equation} The first term models the interaction of individual spins $\sigma_i$ with an external field of strength $J$. The second term, which sums over nearest neighbour pairs, models alignment of atom pairs: the product $\sigma_i\sigma_j$ is positive if the atoms have identical spin, and negative if opposite.
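To make the formula concrete, here is a sketch in C of an energy evaluation for a two-dimensional square lattice with periodic boundaries; the lattice shape and boundary conditions are illustrative assumptions, and the parameters follow the notation above: $J$ is the external field strength and $E$ the pair coupling.

\begin{verbatim}
/* Energy H of a spin configuration sigma on an L x L lattice with
   periodic boundaries: H = -J sum_i sigma_i - E sum_<ij> sigma_i sigma_j.
   Each nearest-neighbour pair is counted once, via the right and
   down neighbours of every site. */
double lattice_energy(int L, int sigma[L][L], double J, double E) {
  double H = 0.0;
  for (int i = 0; i < L; i++)
    for (int j = 0; j < L; j++) {
      H -= J * sigma[i][j];                        /* field term       */
      H -= E * sigma[i][j] * sigma[(i+1)%L][j];    /* right neighbour  */
      H -= E * sigma[i][j] * sigma[i][(j+1)%L];    /* down neighbour   */
    }
  return H;
}
\end{verbatim}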

In statistical mechanics, the probability of a configuration is \begin{equation} P(\sigma) = \exp(-H(\sigma))/Z \end{equation} where the `partition function' $Z$ is defined as \begin{equation} Z = \sum_\sigma \exp(-H(\sigma)) \end{equation} where the sum runs over all $2^N$ configurations.
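The partition function could in principle be computed by brute force, as in the following sketch (with a hypothetical energy function passed as an argument), but since the loop runs over all $2^N$ configurations this is only feasible for very small $N$; this is exactly why we resort to sampling.

\begin{verbatim}
#include <math.h>

/* Brute-force partition function Z = sum over all 2^N configurations
   of exp(-H(sigma)); only feasible for very small N. */
double partition_function(int N, double (*energy)(int, const int*)) {
  double Z = 0.0;
  for (long c = 0; c < (1L << N); c++) {   /* enumerate 2^N configurations */
    int sigma[N];
    for (int i = 0; i < N; i++)
      sigma[i] = (c >> i) & 1 ? 1 : -1;    /* bit i encodes spin i */
    Z += exp(-energy(N, sigma));
  }
  return Z;
}
\end{verbatim}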

A configuration is stable if its energy does not decrease under small perturbations. To find such configurations, we iterate over the lattice, checking whether flipping the spin of an atom lowers the energy. We introduce an element of chance to prevent the process from getting stuck in artificial solutions. (This is the Metropolis algorithm [Metropolis].)

\begin{displayalgorithm} \For{fixed number of iterations}{ \For{each atom $i$}{ {calculate the change $\Delta E$ from changing the sign of $\sigma_i$}\; \If{$\Delta E<0$ or $\exp(-\Delta E)$ greater than some random number}{ accept the change}\; } } \end{displayalgorithm}
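As a sequential reference point, one sweep of this algorithm could look as follows in C, for the same $L\times L$ periodic lattice as above; the use of the library rand() generator and the temperature-free acceptance rule $\exp(-\Delta E)$ follow the simplified formulation in this section.

\begin{verbatim}
#include <stdlib.h>
#include <math.h>

/* One Metropolis sweep over an L x L lattice with periodic boundaries.
   J is the external field strength and E the pair coupling, as in the
   energy formula above. */
void metropolis_sweep(int L, int sigma[L][L], double J, double E) {
  for (int i = 0; i < L; i++)
    for (int j = 0; j < L; j++) {
      /* sum of the four nearest-neighbour spins */
      int nb = sigma[(i+1)%L][j] + sigma[(i+L-1)%L][j]
             + sigma[i][(j+1)%L] + sigma[i][(j+L-1)%L];
      /* energy change from flipping sigma[i][j] */
      double dE = 2.0 * sigma[i][j] * (J + E*nb);
      /* accept if the energy decreases, or with probability exp(-dE) */
      if (dE < 0 || exp(-dE) > (double)rand()/RAND_MAX)
        sigma[i][j] = -sigma[i][j];
    }
}
\end{verbatim}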

This algorithm can be parallelized if we notice the similarity with the structure of the sparse matrix-vector product. In that algorithm too we compute a local quantity by combining inputs from a few nearest neighbours. This means we can partition the lattice, and compute the local updates after each processor collects a ghost region.

Having each processor iterate over local points in the lattice corresponds to a particular global ordering of the lattice; to make the parallel computation equivalent to a sequential one we also need a parallel random number generator (section 18.3).
