# Introduction

This package serves two purposes:

Introduce a common API for packages that operate on

*log densities*, which for these purposes are black box $\mathbb{R}^n \to \mathbb{R}$ mappings. Using the interface of introduced in this package, you can query $n$, evaluate the log density and optionally its gradient, and determine if a particular object supports these methods using traits.**This usage is relevant primarily for package developers**who write generic algorithms that use (log) densities that correspond to posteriors and likelihoods, eg MCMC, MAP, ML. An example is DynamicHMC.jl. This is documented in the API section.Make it easier for

**users who want to perform inference**using the above methods (and packages) to*define their own log densities*, either taking a vector of real numbers as input, or extracting and transforming parameters using the TransformVariables.jl package,*obtain gradients*of these log densities using one of the supported*automatic differentiation*packages of Julia.

This is documented in the next section, with a worked example.

For the purposes of this package, *log densities* are still valid when shifted by a constant that may be unknown, but is consistent within calls. This is necessary for Bayesian inference, where log posteriors are usually calculated up to a constant. See `LogDensityProblems.logdensity`

for details.

# Working with log density problems

Consider an inference problem where IID draws are obtained from a normal distribution,

\[x_i \sim N(\mu, \sigma)\]

for $i = 1, \dots, N$. It can be shown that the *log likelihood* conditional on $\mu$ and $\sigma$ is

\[\ell = -N\log \sigma - \sum_{i = 1}^N \frac{(x-\mu)^2}{2\sigma^2} = -N\left( \log \sigma + \frac{S + (\bar{x} - \mu)^2}{2\sigma^2} \right)\]

where we have dropped constant terms, and defined the sufficient statistics

\[\bar{x} = \frac{1}{N} \sum_{i = 1}^N x_i\]

and

\[S = \frac{1}{N} \sum_{i = 1}^N (x_i - \bar{x})^2\]

Finally, we use priors

\[\mu \sim N(0, 5), \sigma \sim N(0, 2)\]

which yield the log prior

\[-\sigma^2/8 - \mu^2/50\]

which is added to the log likelihood to obtain the log posterior.

It is useful to define a *callable* that implements this, taking some vector `x`

as an input and calculating the summary statistics, then, when called with a `NamedTuple`

containing the parameters, evaluating to the log posterior.

```
using Statistics, UnPack # imported for our implementation
struct NormalPosterior{T} # contains the summary statistics
N::Int
x̄::T
S::T
end
# calculate summary statistics from a data vector
function NormalPosterior(x::AbstractVector)
NormalPosterior(length(x), mean(x), var(x; corrected = false))
end
# define a callable that unpacks parameters, and evaluates the log likelihood
function (problem::NormalPosterior)(θ)
@unpack μ, σ = θ
@unpack N, x̄, S = problem
loglikelihood = -N * (log(σ) + (S + abs2(μ - x̄)) / (2 * abs2(σ)))
logprior = - abs2(σ)/8 - abs2(μ)/50
loglikelihood + logprior
end
problem = NormalPosterior(randn(100))
```

Let's try out the posterior calculation:

`julia> problem((μ = 0.0, σ = 1.0))`

`-48.60297471151442`

Just evaluating your log density function like above is a great way to test and benchmark your implementation. See the “Performance Tips” section of the Julia manual for optimization advice.

## Using the TransformVariables package

In our example, we require $\sigma > 0$, otherwise the problem is meaningless. However, many MCMC samplers prefer to operate on *unconstrained* spaces $\mathbb{R}^n$. The TransformVariables package was written to transform unconstrained to constrained spaces, and help with the log Jacobian correction (more on that later). That package has detailed documentation, now we just define a transformation from a length 2 vector to a `NamedTuple`

with fields `μ`

(unconstrained) and `σ > 0`

.

Since version 1.0, TransformedLogDensity has been moved to the package TransformedLogDensities.

`julia> using LogDensityProblems, TransformVariables, TransformedLogDensities`

`julia> ℓ = TransformedLogDensity(as((μ = asℝ, σ = asℝ₊)), problem)`

`TransformedLogDensity of dimension 2`

Then we can query the dimension of this problem, and evaluate the log density:

`julia> LogDensityProblems.dimension(ℓ)`

`2`

`julia> LogDensityProblems.logdensity(ℓ, zeros(2))`

`-48.60297471151442`

Before running time-consuming algorithms like MCMC, it is advisable to test and benchmark your log density evaluations separately. The same applies to `LogDensityProblems.logdensity_and_gradient`

.

## Manual unpacking and transformation

If you prefer to implement the transformation yourself, you just have to define the following three methods for your problem: declare that it can evaluate log densities (but not their gradient, hence the `0`

order), allow the dimension of the problem to be queried, and then finally code the density calculation with the transformation. (Note that using `TransformedLogDensities.TransformedLogDensity`

takes care of all of these for you, as shown above).

```
function LogDensityProblems.capabilities(::Type{<:NormalPosterior})
LogDensityProblems.LogDensityOrder{0}()
end
LogDensityProblems.dimension(::NormalPosterior) = 2
function LogDensityProblems.logdensity(problem::NormalPosterior, x)
μ, logσ = x
σ = exp(logσ)
problem((μ = μ, σ = σ)) + logσ
end
```

`julia> LogDensityProblems.logdensity(problem, zeros(2))`

`-48.60297471151442`

Here we use the exponential function to transform from $\mathbb{R}$ to the positive reals, but this requires that we correct the log density with the *logarithm* of the Jacobian, which here happens to be $\log(\sigma)$.

## Automatic differentiation

Using either definition, you can now transform to another object which is capable of evaluating the *gradient*, using automatic differentiation. The wrapper for this is

`LogDensityProblems.ADgradient`

— Function```
ADgradient(kind, P; kwargs...)
```

Wrap `P`

using automatic differentiation to obtain a gradient.

`kind`

is usually a `Val`

type with a symbol that refers to a package, for example

```
ADgradient(Val(:ForwardDiff), P)
ADgradient(Val(:ReverseDiff), P)
ADgradient(Val(:Zygote), P)
```

Some methods may be loaded only conditionally after the relevant package is loaded (eg `using Zygote`

).

The symbol can also be used directly as eg

`ADgradient(:ForwardDiff, P)`

and should mostly be equivalent if the compiler manages to fold the constant.

The function `parent`

can be used to retrieve the original argument.

```
ADgradient(:ForwardDiff, ℓ; chunk, gradientconfig)
ADgradient(Val(:ForwardDiff), ℓ; chunk, gradientconfig)
```

Wrap a log density that supports evaluation of `Value`

to handle `ValueGradient`

, using `ForwardDiff`

.

Keywords are passed on to `ForwardDiff.GradientConfig`

to customize the setup. In particular, chunk size can be set with a `chunk`

keyword argument (accepting an integer or a `ForwardDiff.Chunk`

).

```
ADgradient(:Tracker, ℓ)
ADgradient(Val(:Tracker), ℓ)
```

Gradient using algorithmic/automatic differentiation via Tracker.

This package has been deprecated in favor of Zygote, but we keep the interface available.

```
ADgradient(:Zygote, ℓ)
ADgradient(Val(:Zygote), ℓ)
```

Gradient using algorithmic/automatic differentiation via Zygote.

Now observe that we can obtain gradients, too:

`julia> import ForwardDiff`

`julia> ∇ℓ = ADgradient(:ForwardDiff, ℓ)`

`ForwardDiff AD wrapper for TransformedLogDensity of dimension 2, w/ chunk size 2`

`julia> LogDensityProblems.capabilities(∇ℓ)`

`LogDensityProblems.LogDensityOrder{1}()`

`julia> LogDensityProblems.logdensity_and_gradient(∇ℓ, zeros(2))`

`(-48.60297471151442, [2.8381559225185673, -2.294050576971158])`

## Manually calculated derivatives

If you prefer not to use automatic differentiation, you can wrap your own derivatives following the template

```
function LogDensityProblems.capabilities(::Type{<:NormalPosterior})
LogDensityProblems.LogDensityOrder{1}() # can do gradient
end
LogDensityProblems.dimension(::NormalPosterior) = 2 # for this problem
function LogDensityProblems.logdensity_and_gradient(problem::NormalPosterior, x)
logdens = ...
grad = ...
logdens, grad
end
```

If the gradient is a mutable vector (eg `Vector`

), it should not be reused for another purpose. Practically, each call to `LogDensityProblems.logdensity_and_gradient`

should allocate a new one, or use immutables like `StaticArrays.SVector`

for small dimensions.

# Various utilities

You may find these utilities useful for debugging and optimization.

`LogDensityProblems.stresstest`

— Function```
stresstest(f, ℓ; N, rng, scale)
```

Test `ℓ`

with random values.

`N`

random vectors are drawn from a standard multivariate Cauchy distribution, scaled with `scale`

(which can be a scalar or a conformable vector).

Each random vector is then used as an argument in `f(ℓ, ...)`

. `logdensity`

and `logdensity_and_gradient`

are recommended for `f`

.

In case the call produces an error, the value is recorded as a failure, which are returned by the function.

Not exported, but part of the API.

`LogDensityProblems.benchmark_ForwardDiff_chunks`

— Function```
benchmark_ForwardDiff_chunks(ℓ; chunks, markprogress, x)
```

Benchmark a log density problem with various chunk sizes using ForwardDiff.

`chunks`

, which defaults to all possible chunk sizes, determines the chunks that are tried.

The function returns `chunk => time`

pairs, where `time`

is the benchmarked runtime in seconds, as determined by `BenchmarkTools.@belapsed`

. The gradient is evaluated at `x`

(defaults to zeros).

*Runtime may be long* because of tuned benchmarks, so when `markprogress == true`

(the default), dots are printed to mark progress.

This function is not exported, but part of the API.

*It is loaded conditionally when both ForwardDiff and BenchmarkTools are loaded.*

# Log densities API

Use the functions below for evaluating gradients and querying their dimension and other information. These symbols are not exported, as they are mostly used by package developers and in any case would need to be `import`

ed or qualified to add methods to.

`LogDensityProblems.capabilities`

— Function```
capabilities(T)
```

Test if the type (or a value, for convenience) supports the log density interface.

When `nothing`

is returned, it doesn't support this interface. When `LogDensityOrder{K}()`

is returned (typically with `K == 0`

or `K = 1`

), derivatives up to order `K`

are supported. *All other return values are invalid*.

**Interface description**

The following methods need to be implemented for the interface:

`dimension`

returns the*dimension*of the domain,`logdensity`

evaluates the log density at a given point.`logdensity_and_gradient`

when`K ≥ 1`

.

See also `LogDensityProblems.stresstest`

for stress testing.

`LogDensityProblems.LogDensityOrder`

— Type`struct LogDensityOrder{K}`

A trait that means that a log density supports evaluating derivatives up to order `K`

.

Typical values for `K`

are `0`

(just the log density) and `1`

(log density and gradient).

`LogDensityProblems.dimension`

— Function`dimension(ℓ)`

Dimension of the input vectors `x`

for log density `ℓ`

. See `logdensity`

, `logdensity_and_gradient`

.

This function is *distinct* from `TransformedVariables.dimension`

.

`LogDensityProblems.logdensity`

— Function`logdensity(ℓ, x)`

Evaluate the log density `ℓ`

at `x`

, which has length compatible with its `dimension`

.

Return a real number, which may or may not be finite (can also be `NaN`

). Non-finite values other than `-Inf`

are invalid but do not error, caller should deal with these appropriately.

**Note about constants**

Log densities can be shifted by *the same constant*, as long as it is consistent between calls. For example,

`logdensity(::StandardMultivariateNormal) = -0.5 * sum(abs2, x)`

is a valid implementation for some callable `StandardMultivariateNormal`

that would implement the standard multivariate normal distribution (dimension $k$) with pdf

\[(2\pi)^{-k/2} e^{-x'x/2}\]

`LogDensityProblems.logdensity_and_gradient`

— Function`logdensity_and_gradient(ℓ, x)`

Evaluate the log density `ℓ`

and its gradient at `x`

, which has length compatible with its `dimension`

.

Return two values:

the log density as real number, which equivalent to

`logdensity(ℓ, x)`

*if*the log density is finite, the gradient, an`::AbstractVector`

of real numbers, otherwise this value is arbitrary and should be ignored.

Caller may assume ownership of results, ie that the gradient vector will not be overwritten or reused for a different purpose.

The first argument (the log density) can be shifted by a constant, see the note for `logdensity`

.