# Common random variables: an optimization case study

*2018/03/23.* Code used in this post was written for Julia **v0.6.2** and, when applicable, reflects the state of packages used on **2018-03-23**. You may need to modify it for different versions.

## Motivation

When simulating models with random components, it is frequently advantageous to decompose the structure into

- *parameters* \(\theta\) that characterize structural relationships,
- *noise* \(\varepsilon\) that is independent of the parameters,
- a function \(f(\theta, \varepsilon)\) that generates observed data or moments.

Doing this allows us to use common random variables (aka common random *numbers*), a technique that simply keeps \(\varepsilon\) fixed across different \(\theta\). This can make \(f\) differentiable in \(\theta\), which allows the use of derivative-based optimization algorithms (e.g. for maximum likelihood or MAP) or derivative-based MCMC methods. It is also used to reduce the variance of simulated moments.
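To make the idea concrete, here is a minimal sketch (not from the code below; `f`, `Δ_crv`, and `Δ_indep` are hypothetical names): holding \(\varepsilon\) fixed, the difference of a simulated moment at two nearby \(\theta\)s reflects only the change in \(\theta\), while redrawing \(\varepsilon\) buries that change in sampling noise.

```
f(θ, ϵ) = sum(exp.(θ .+ ϵ)) / length(ϵ)   # toy simulated moment, estimates E[exp(θ + ϵ)]

ϵ = randn(10^6)                           # common random variables, drawn once

# same ϵ at both θs: the difference is nearly deterministic in θ
Δ_crv = f(1.01, ϵ) - f(1.0, ϵ)

# fresh draws at each θ: the difference is dominated by sampling noise
Δ_indep = f(1.01, randn(10^6)) - f(1.0, randn(10^6))
```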

When implementing this technique in Julia, I frequently create a wrapper structure that saves the \(\varepsilon\), allocating an `Array` which can be updated in place. Since a redesign of DynamicHMC.jl is coming up which will accommodate simulated likelihood methods in a more disciplined manner, I wanted to explore other options, most importantly StaticArrays.jl.

Here I benchmark the alternatives for Julia `v0.6.2` using a simple toy model. **TL;DR** for the impatient: `StaticArrays.jl` is 150x faster, and this does not depend on using immutable or mutable `StaticArray`s, or even on reallocating every time new \(\varepsilon\)s are generated.

## Problem setup

The setup is simple: we draw \(M\) observations, and our noise is

\[ \varepsilon_{i,j} \sim N(0, 1) \qquad \text{for } i = 1, \dots, M; j = 1, \dots, 7. \]

Our parameters are \(\mu\) and \(\sigma\), and for each \(i\) we calculate

\[ A_i = \frac17 \sum_{j=1}^7 \exp(\mu + \sigma \varepsilon_{i,j}) \]

which is the sample average after a nonlinear transformation. The \(7\) is a bit accidental; it comes from simplifying an actual problem I am working on. We are interested in the sample average of the \(A_i\). I deliberately refrain from micro-optimizing each version, to reflect how I would approach a real-life problem.
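As a sanity check before introducing the generic interface, the model above can be simulated directly; `direct_average_moment` below is a hypothetical helper for illustration, not part of the benchmarked code.

```
"Draw `M` observations directly and return the sample average of the ``A_i``."
function direct_average_moment(μ, σ, M)
    A = [sum(exp.(μ .+ σ .* randn(7))) / 7 for _ in 1:M]
    sum(A) / M
end
```

With \(\sigma = 0\), each \(A_i = \exp(\mu)\) exactly, which makes a convenient correctness check for any implementation.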

We code the common interface as

```
using BenchmarkTools
using StaticArrays
using Parameters

# common interface

"Dimension of noise ``ϵ`` for each observation."
const EDIM = 7

"""
Common random variables. The user needs to define

1. `observation_moments`, which should use `observation_moment`,

2. `newcrv = update!(crv)`, which returns new common random variables,
   potentially (but not necessarily) overwriting `crv`.
"""
abstract type CRVContainer end

observation_moment(ϵ, μ, σ) = mean(@. exp(μ + σ * ϵ))

average_moment(crv::CRVContainer, μ, σ) = mean(observation_moments(crv, μ, σ))

"Calculate statistics, making `N` draws, updating every `L`th time."
function stats(crv, μ, σ, N, L)
    _stat(i) = (i % L == 0 && (crv = update!(crv)); average_moment(crv, μ, σ))
    [_stat(i) for i in 1:N]
end
```

The way I wrote `stats` is representative of how I use HMC/NUTS: simulated moments on the same trajectory are calculated with the same \(\varepsilon\)s, which are updated for each trajectory. Of course, the parameters would change along the trajectory; here they don't, but this does not affect the benchmarks.

The semantics of `update!` allow *both* in-place modification and a functional style.
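A toy illustration of this convention (the `ToyCRV` type is hypothetical, just for this sketch): because callers always rebind via `crv = update!(crv)`, an implementation is free either to mutate and return its argument, or to return a freshly allocated container.

```
struct ToyCRV
    ϵ::Vector{Float64}
end

# in-place style: overwrite the existing buffer, then return the container
update!(t::ToyCRV) = (t.ϵ .= randn(length(t.ϵ)); t)

# a functional style would instead be
#   update!(t::ToyCRV) = ToyCRV(randn(length(t.ϵ)))

crv = ToyCRV(randn(5))
crv = update!(crv)    # correct under either implementation
```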

## Using a preallocated matrix

This is the standard way I would write this.^{1} \(\varepsilon\)s are in columns of a matrix, which is preferable for mapping them as slices; then they are mapped using `observation_moment`. `update!` overwrites the contents.

```
"""
Common random variables are stored in columns of a matrix.
"""
struct PreallocatedMatrix{T} <: CRVContainer
    ϵ::Matrix{T}
end

PreallocatedMatrix(M::Int) = PreallocatedMatrix(randn(EDIM, M))

update!(p::PreallocatedMatrix) = (randn!(p.ϵ); p)

observation_moments(p::PreallocatedMatrix, μ, σ) =
    vec(mapslices(ϵ -> observation_moment(ϵ, μ, σ), p.ϵ, 1))
```

## Using static vectors

We share the following between various static vector implementations:

```
"Common random variables as a vector of vectors, in the `ϵs`."
abstract type CRVVectors <: CRVContainer end

observation_moments(p::CRVVectors, μ, σ) =
    map(ϵ -> observation_moment(ϵ, μ, σ), p.ϵs)
```

I find the use of `map` more intuitive here than `mapslices` above.

### Static vectors, container preallocated

In the first version using static vectors, we keep `SVector`s in a `Vector`, and update in place.

```
struct PreallocatedStaticCRV{K, T} <: CRVVectors
    ϵs::Vector{SVector{K, T}}
end

PreallocatedStaticCRV(M::Int) =
    PreallocatedStaticCRV([@SVector(randn(EDIM)) for _ in 1:M])

function update!(p::PreallocatedStaticCRV)
    @unpack ϵs = p
    @inbounds for i in indices(ϵs, 1)
        ϵs[i] = @SVector(randn(EDIM))
    end
    p
end
```

### Mutable static vectors, overwritten

We modify this to use mutable vectors — this should not make a difference.

```
struct MutableStaticCRV{K, T} <: CRVVectors
    ϵs::Vector{MVector{K, T}}
end

MutableStaticCRV(M::Int) =
    MutableStaticCRV([@MVector(randn(EDIM)) for _ in 1:M])

function update!(p::MutableStaticCRV)
    @unpack ϵs = p
    @inbounds for i in indices(ϵs, 1)
        randn!(ϵs[i])
    end
    p
end
```

### Static vectors, allocated each time

Finally, the third solution allocates new static vectors each time `update!` is called:

```
struct GeneratedStaticCRV{K, T} <: CRVVectors
    ϵs::Vector{SVector{K, T}}
end

GeneratedStaticCRV(M::Int) =
    GeneratedStaticCRV([@SVector(randn(EDIM)) for _ in 1:M])

update!(p::GeneratedStaticCRV{K, T}) where {K, T} =
    GeneratedStaticCRV([@SVector(randn(T, K)) for _ in indices(p.ϵs, 1)])
```

## Results

Running

```
@btime mean(stats(PreallocatedMatrix(100), 1.0, 0.1, 100, 10))
@btime mean(stats(PreallocatedStaticCRV(100), 1.0, 0.1, 100, 10))
@btime mean(stats(MutableStaticCRV(100), 1.0, 0.1, 100, 10))
@btime mean(stats(GeneratedStaticCRV(100), 1.0, 0.1, 100, 10))
```

we obtain

| solution | time | allocations |
| --- | --- | --- |
| `PreallocatedMatrix` | `230 ms` | `22 MiB` |
| `PreallocatedStaticCRV` | `1.5 ms` | `102 KiB` |
| `MutableStaticCRV` | `1.5 ms` | `104 KiB` |
| `GeneratedStaticCRV` | `1.5 ms` | `666 KiB` |

As a preview of future improvements, I tried `PreallocatedMatrix` on the current `master` (which will become Julia `v0.7`), obtaining `3.5 ms` (`2.46 MiB`), which is really promising.^{2}

The conclusion is that `StaticArrays` simplifies and speeds up my code *at the same time*. I especially like the last version (`GeneratedStaticCRV`), because it obviates the need to think about types in advance. While the example here is simple, in practice I would use automatic differentiation, which makes it more challenging to determine buffer types in advance. I expect I will transition to a more “buffer-free” style in the future, and design the interface for DynamicHMC.jl accordingly.

Download the code as `code.jl`.

**PS:** From now on, my blog posts with Julia code will have a banner about version information.

1. Which will change following this blog post 😁
2. The other 3 options are slow because of deprecation warnings.