(Editor Note: These notes are not my own text; they are copied from MacKay and the astrophysics source below. They will be edited gradually over the next few months. I provide them because writing them was useful for me.)
Hamiltonian Monte Carlo is a Metropolis method, applicable to continuous state spaces, that makes use of gradient information to reduce random-walk behaviour.
These notes are based on a combination of http://python4mpia.github.io/index.html and David MacKay's book *Information Theory, Inference, and Learning Algorithms*.
For many systems whose probability $P(x)$ can be written in the form

$$P(x) = \frac{e^{-E(x)}}{Z},$$

not only $E(x)$ but also its gradient with respect to $x$ can be readily evaluated. It seems wasteful to use a simple random-walk Metropolis method when this gradient is available: the gradient indicates which direction one should go in to find states that have higher probability!
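For example, if $P(x)$ is a unit-variance Gaussian with mean $\mu$, then $E(x) = \tfrac{1}{2}(x - \mu)^2$ and $\nabla_x E(x) = x - \mu$, so $-\nabla_x E$ always points from $x$ back towards the mode at $\mu$.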
In the Hamiltonian Monte Carlo method, the state space $x$ is augmented by momentum variables $p$, and there is an alternation of two types of proposal. The first proposal randomizes the momentum variable, leaving the state $x$ unchanged. The second proposal changes both $x$ and $p$ using simulated Hamiltonian dynamics as defined by the Hamiltonian

$$H(x, p) = E(x) + K(p),$$

where $K(p)$ is a 'kinetic energy' such as $K(p) = p^{\mathsf{T}} p / 2$. These two proposals are used to create (asymptotically) samples from the joint density

$$P_H(x, p) = \frac{1}{Z_H} \exp[-H(x, p)] = \frac{1}{Z_H} \exp[-E(x)] \exp[-K(p)].$$

This density is separable, so the marginal distribution of $x$ is the desired distribution $P(x) = e^{-E(x)}/Z$. So, simply discarding the momentum variables, we obtain a sequence of samples $\{x^{(t)}\}$ that asymptotically come from $P(x)$.
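If the dynamics could be simulated exactly, $H(x, p)$ would be conserved along a trajectory and the dynamical proposal could always be accepted. In practice the dynamics are discretized (a leapfrog integrator is sketched below), so $H$ changes slightly, and the proposal $(x', p')$ is accepted with the Metropolis probability

$$\min\bigl(1, \exp[H(x, p) - H(x', p')]\bigr),$$

which corrects for the discretization error.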
An algorithm for Hamiltonian Monte-Carlo
First, we need to compute the gradient of our objective function.
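The tutorial's actual model is not reproduced in these notes, so as a stand-in assume we are estimating the mean `mu` of Gaussian data with known spread `sigma_data` (all names and values here are illustrative). The "energy" is then the negative log-likelihood up to an additive constant, and its gradient is available in closed form:

```python
import numpy as np

# Illustrative stand-in model: Gaussian data with known sigma_data and
# unknown mean mu. E(mu) is the negative log-likelihood up to a constant.
sigma_data = 0.5

def energy(mu, data):
    # E(mu) = sum_i (d_i - mu)^2 / (2 sigma_data^2)
    return np.sum((data - mu) ** 2) / (2.0 * sigma_data ** 2)

def gradient(mu, data):
    # dE/dmu = -sum_i (d_i - mu) / sigma_data^2
    return -np.sum(data - mu) / sigma_data ** 2

# Finite-difference sanity check of the gradient.
data = np.random.default_rng(0).standard_normal(100)
eps = 1e-6
fd = (energy(1.0 + eps, data) - energy(1.0 - eps, data)) / (2 * eps)
assert np.isclose(fd, gradient(1.0, data))
```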
Hamiltonian Monte-Carlo makes use of the fact that we can write our likelihood as

$$\mathcal{L}(x) = e^{-E(x)},$$

where $E(x)$ is the "energy". The algorithm then uses Hamiltonian dynamics to modify the way candidates are proposed:
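The tutorial's own implementation is not reproduced here; the following is a minimal sketch of the sampler under the stand-in model above, with illustrative step sizes and iteration counts. It alternates the two proposals from MacKay's description: resample the momentum, then simulate the discretized dynamics and apply a Metropolis accept/reject on $H = E + K$.

```python
import numpy as np

def leapfrog(x, p, gradient, data, step_size, n_steps):
    # Discretized Hamiltonian dynamics: half step in p, alternating
    # full steps in x and p, and a final half step in p.
    p = p - 0.5 * step_size * gradient(x, data)
    for _ in range(n_steps - 1):
        x = x + step_size * p
        p = p - step_size * gradient(x, data)
    x = x + step_size * p
    p = p - 0.5 * step_size * gradient(x, data)
    return x, p

def hmc(x0, data, energy, gradient, n_samples=10000,
        step_size=0.005, n_steps=30, seed=1):
    rng = np.random.default_rng(seed)
    x = x0
    samples = []
    n_accept = 0
    for _ in range(n_samples):
        # Proposal 1: draw a fresh momentum, leaving x unchanged.
        p = rng.standard_normal()
        # Proposal 2: simulate the dynamics, then accept or reject
        # on the change in total energy H = E(x) + K(p).
        x_new, p_new = leapfrog(x, p, gradient, data, step_size, n_steps)
        h_old = energy(x, data) + 0.5 * p ** 2
        h_new = energy(x_new, data) + 0.5 * p_new ** 2
        if rng.random() < np.exp(min(0.0, h_old - h_new)):
            x, n_accept = x_new, n_accept + 1
        samples.append(x)  # the momentum is simply discarded
    return np.array(samples), n_accept / n_samples
```

Summaries such as `samples.mean()`, `samples.std()`, and the returned acceptance fraction are what produce numbers like the mean, sigma, and acceptance rate quoted below.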
The Monte-Carlo estimate can then be compared with the true value we used to generate the data. The run reported:

```
Mean: 2.35130302322
Sigma: 0.00147375480435
Acceptance rate = 0.645735426457
```