Introduction

Some problems, especially in numerical integration and Markov Chain Monte Carlo, benefit from transformation of variables: for example, if $σ > 0$ is a standard deviation parameter, it is usually better to work with log(σ) which can take any value on the real line. However, in general such transformations require correcting density functions by the determinant of their Jacobian matrix, usually referred to as "the Jacobian".

Also, is usually easier to code MCMC algorithms to work with vectors of real numbers, which may represent a "flattened" version of parameters, and would need to be decomposed into individual parameters, which themselves may be arrays, tuples, or special objects like lower triangular matrices.

This package is designed to help with both of these use cases. For example, consider the "8 schools" problem from Chapter 5.5 of Gelman et al (2013), in which SAT scores $y_{ij}$ in $J=8$ schools are modeled using a conditional normal

\[y_{ij} ∼ N(θⱼ, σ²)\]

and the $θⱼ$ are assume to have a hierarchical prior distribution

\[θⱼ ∼ N(μ, τ²)\]

For this problem, one could define a transformation

using TransformVariables
t = as((μ = asℝ, σ = asℝ₊, τ = asℝ₊, θs = as(Array, 8)))
dimension(t)

which would then yield a NamedTuple with the given names, with one of them being a Vector:

julia> x = randn(dimension(t))11-element Vector{Float64}:
  0.06193274031408013
  0.2784058141640002
 -0.5958244153640522
  0.04665938957338174
  1.0857940215432762
 -1.5765649225859841
  0.1759399913010747
  0.8653808054093252
  0.972024394360624
  1.546409924955377
 -0.5841980481085709
julia> y = transform(t, x)(μ = 0.06193274031408013, σ = 1.3210221779582236, τ = 0.5511080366004918, θs = [0.04665938957338174, 1.0857940215432762, -1.5765649225859841, 0.1759399913010747, 0.8653808054093252, 0.972024394360624, 1.546409924955377, -0.5841980481085709])
julia> keys(y)(:μ, :σ, :τ, :θs)
julia> y.θs8-element Vector{Float64}:
  0.04665938957338174
  1.0857940215432762
 -1.5765649225859841
  0.1759399913010747
  0.8653808054093252
  0.972024394360624
  1.546409924955377
 -0.5841980481085709

Further worked examples of using this package can be found in the DynamicHMCExamples.jl repository. It is recommended that the user reads those first, and treats this documentation as a reference.

General interface

Transformations

TransformVariables.dimension — Function

dimension(t::AbstractTransform)

The dimension (number of elements) that t transforms.

Types should implement this method.

TransformVariables.transform — Function

transform(t, x)

Transform x using t.

transform(t)

Return a callable equivalent to x -> transform(t, x) that transforms its argument:

transform(t, x) == transform(t)(x)

TransformVariables.transform_and_logjac — Function

transform_and_logjac(t, x)

Transform x using t; calculating the log Jacobian determinant, returned as the second value.

Inverses

TransformVariables.inverse — Function

inverse(t, y)

Return x so that transform(t, x) ≈ y.

inverse(t)

Return a callable equivalent to y -> inverse(t, y). t can also be a callable created with transform, so the following holds:

inverse(t)(y) == inverse(t, y) == inverse(transform(t))(y)

Note that eltype(inverse(t, y)) ≡ inverse_eltype(t, typeof(y)) holds. See inverse_eltype.

TransformVariables.inverse! — Function

inverse!(x, transformation, y)

Put inverse(t, y) into a preallocated vector x, returning x.

Generalized indexing should be assumed on x.

See inverse_eltype for determining the type of x.

TransformVariables.inverse_eltype — Function

inverse_eltype(t::AbstractTransform, y)
inverse_eltype(t::AbstractTransform, ::Type{T})

The element type for vector x so that inverse!(x, t, y::T) works.

Notes

It is not guaranteed that the result is the narrowest possible type, and may change without warning between versions. Some effort is made to come up with a reasonable concrete type even in corner cases.
Transformations should provide a method for types, not values.
No dimension or input compatibility checks are guaranteed to be performed, even for values.

The value is always a numerical type (ie <:Real), but in corner cases (eg transformations of zero dimension) may not be a float.

Integration into Bayesian inference

TransformVariables.transform_logdensity — Function

transform_logdensity(t, f, x)

Let $y = t(x)$, and $f(y)$ a log density at y. This function evaluates f ∘ t as a log density, taking care of the log Jacobian correction.

TransformVariables.logprior — Function

logprior(t, y)

Return the log prior correction used in transform_and_logjac. The second argument is the output of a transformation.

The log jacobian determinant is corrected by this value, usually for the purpose of making a distribution proper. Can only be nonzero when nonzero_logprior is true.

TransformVariables.nonzero_logprior — Function

nonzero_logprior(t)

Return true only if there are potential inputs for which logprior is nonzero.

Note

Currently the only transformation that has a log prior correction is unit_vector_norm.

Miscellaneous

TransformVariables.domain_label — Function

domain_label(transformation, index)

Return a string that can be used to for identifying a coordinate. Mainly for debugging and generating graphs and data summaries.

Transformations may provide a heuristic label.

Transformations should implement _domain_label.

Example

julia> t = as((a = asℝ₊,
            b = as(Array, asℝ₋, 1, 1),
            c = corr_cholesky_factor(2)));

julia> [domain_label(t, i) for i in 1:dimension(t)]
3-element Vector{String}:
 ".a"
 ".b[1,1]"
 ".c[1]"

Defining transformations

The `as` constructor and aggregations

Some transformations, particularly aggregations use the function as as the constructor. Aggregating transformations are built from other transformations to transform consecutive (blocks of) real numbers into the desired domain.

It is recommended that you use as(Array, ...) and friends (as(Vector, ...), as(Matrix, ...)) for repeating the same transformation, and named tuples such as as((μ = ..., σ = ...)) for transforming into named parameters. For extracting parameters in log likelihoods, consider Parameters.jl.

See methods(as) for all the constructors, ?as for their documentation.

TransformVariables.as — Function

as(T, args...)

Shorthand for constructing transformations with image in T. args determines or modifies behavior, details depend on T.

Not all transformations have an as method, some just have direct constructors. See methods(as) for a list.

Examples

as(Real, -∞, 1)          # transform a real number to (-∞, 1)
as(Array, 10, 2)         # reshape 20 real numbers to a 10x2 matrix
as(Array, as𝕀, 10)       # transform 10 real numbers to (0, 1)
as((a = asℝ₊, b = as𝕀)) # transform 2 real numbers a NamedTuple, with a > 0, 0 < b < 1
as(SArray{1,2,3}, as𝕀)  # transform to a static array of positive numbers

Transforms which produce NamedTuples can be merged, which internally calls Base.merge; name collisions will thus follow Base behavior, which is that the right-most instance will be kept. When using e.g. ConstructionBase.setproperties to map a vector onto a subset of parameters stored in a struct, this functionality allows transforms for different parameter subsets to be constructed for use separately or together:

t_a = as((;a = asℝ₊))
t_b = as((;b = as𝕀))
t_c = as((;c = TVShift(5) ∘ TVExp()))
t_ab = merge(t_a, t_b)
t_abc = merge(t_ab, t_c)
t_abc = merge(t_a, t_b, t_c)
t_collision = merge(t_a, as((;a = asℝ₋))) # Will have a = asℝ₋, from rightmost

Scalar transforms

The symbol ∞ is a placeholder for infinity. It does not correspond to Inf, but acts as a placeholder for the correct dispatch. -∞ is valid.

TransformVariables.∞ — Constant

Placeholder representing of infinity for specifing interval boundaries. Supports the - operator, ie -∞.

as(Real, a, b) defines transformations to finite and (semi-)infinite subsets of the real line, where a and b can be -∞ and ∞, respectively.

TransformVariables.as — Method

as(Real, left, right)

Return a transformation that transforms a single real number to the given (open) interval.

left < right is required, but may be -∞ or ∞, respectively, in which case the appropriate transformation is selected. See ∞.

Some common transformations are predefined as constants, see asℝ, asℝ₋, asℝ₊, as𝕀.

Note

The finite arguments are promoted to a common type and affect promotion. Eg transform(as(0, ∞), 0f0) isa Float32, but transform(as(0.0, ∞), 0f0) isa Float64.

The following constants are defined for common cases.

TransformVariables.asℝ — Constant

Transform to the real line (identity). See as.

asℝ and as_real are equivalent alternatives.

TransformVariables.asℝ₊ — Constant

Transform to a positive real number. See as.

asℝ₊ and as_positive_real are equivalent alternatives.

TransformVariables.asℝ₋ — Constant

Transform to a negative real number. See as.

asℝ₋ and as_negative_real are equivalent alternatives.

TransformVariables.as𝕀 — Constant

Transform to the unit interval (0, 1). See as.

as𝕀 and as_unit_interval are equivalent alternatives.

For more granular control than the as(Real, a, b), scalar transformations can be built from individual elements with the composition operator ∘ (typed as \circ<tab>):

TransformVariables.TVExp — Type

struct TVExp <: TransformVariables.ScalarTransform

Exponential transformation x ↦ eˣ. Maps from all reals to the positive reals.

TransformVariables.TVLogistic — Type

struct TVLogistic <: TransformVariables.ScalarTransform

Logistic transformation x ↦ logit(x). Maps from all reals to (0, 1).

TransformVariables.TVScale — Type

struct TVScale{T} <: TransformVariables.ScalarTransform

Scale transformation x ↦ scale * x.

TransformVariables.TVShift — Type

struct TVShift{T<:Real} <: TransformVariables.ScalarTransform

Shift transformation x ↦ x + shift.

TransformVariables.TVNeg — Type

struct TVNeg <: TransformVariables.ScalarTransform

Negative transformation x ↦ -x.

Consistent with common notation, transforms are applied right-to-left; for example, as(Real, ∞, 3) is equivalent to TVShift(3) ∘ TVNeg() ∘ TVExp(). If you are working in an editor where typing Unicode is difficult, TransformVariables.compose is also available, as in TransformVariables.compose(TVScale(5.0), TVNeg(), TVExp()).

This composition works with any scalar transform in any order, so TVScale(4) ∘ as(Real, 2, ∞) ∘ TVShift(1e3) is a valid transform. This is useful especially for making sure that values near 0, when transformed, yield usefully-scaled values for a given variable.

In addition, the TVScale transform accepts arbitrary types. It can be used as the outermost transform (so leftmost in the composition) to add, for example, Unitful units to a number (or to create other exotic number types which can be constructed by multiplying, such as a ForwardDiff.Dual).

However, note that calculating log Jacobian determinants may error for types that are not real numbers. For example,

using Unitful
t = TVScale(5u"m") ∘ TVExp()

produces positive quantities with the dimension of length.

Note

Because the log-Jacobian of a transform that adds units is not defined, transform_and_logjac and inverse_and_logjac only have methods defined for TVScale{T} where {T<:Real}.

Note

The inverse transform of TVScale(scale) divides by scale, which is the correct inverse for adding units to a number, but may be inappropriate for other custom number types. A transform that doesn't just multiply or an inverse that extracts a float from an exotic number type could be defined by adding methods to transform and inverse like the following: transform(t::TVScale{T}, x) where T<:MyCustomNumberType = MyCustomNumberType(x) inverse(t::TVScale{T}, x) where T<:MyCustomNumberType = get_the_float_part(x)

Special arrays

TransformVariables.unit_vector_norm — Function

unit_vector_norm(n; chi_prior)

Transform n ≥ 2 real numbers to a unit vector of length n and a radius, under the Euclidean norm. Returns the tuple (normalized_vector, radius).

When chi_prior = true, a prior correction is applied to the radius, which only affects the log Jacobian determinant. The purpose of this is to make the distribution proper. If you wish to use another prior, set this to false and use a manual correction, see also logprior.

Note

At the origin, this transform is non-bijective and non-differentiable. If maximizing a target distribution whose density is constant for the unit vector, then the maximizer using the Chi prior is at the origin, and behavior is undefined.

Note

While $n = 1$ would be technically possible, for practical purposes it would likely suffer from numerical issues, since the transform is undefined at $x = 0$, and for a Markov chain to travel from $y=[-1]$ to $y=[1]$, it would have to leap over the origin, which is only even possible due to discretization and likely will often not work. Because of this, it is disallowed.

TransformVariables.UnitVector — Type

UnitVector(n)

Transform n-1 real numbers to a unit vector of length n, under the Euclidean norm.

TransformVariables.UnitSimplex — Type

UnitSimplex(n)

Transform n-1 real numbers to a vector of length n whose elements are non-negative and sum to one.

TransformVariables.CorrCholeskyFactor — Type

CorrCholeskyFactor(n)

Note

It is better style to use corr_cholesky_factor, this will be deprecated.

Cholesky factor of a correlation matrix of size n.

Transforms $n×(n-1)/2$ real numbers to an $n×n$ upper-triangular matrix U, such that U'*U is a correlation matrix (positive definite, with unit diagonal).

Notes

If

z is a vector of n IID standard normal variates,
σ is an n-element vector of standard deviations,
U is obtained from CorrCholeskyFactor(n),

then Diagonal(σ) * U' * z will be a multivariate normal with the given variances and correlation matrix U' * U.

TransformVariables.corr_cholesky_factor — Function

corr_cholesky_factor(n)

Transform into a Cholesky factor of a correlation matrix.

If the argument is a (positive) integer n, it determines the size of the output n × n, resulting in a Matrix.

If the argument is SMatrix{N,N}, an SMatrix is produced.

Miscellaneous transformations

TransformVariables.Constant — Type

Constant(value)

Placeholder for inserting a constant. Inverse checks equality with ==.

Defining custom transformations

TransformVariables.logjac_forwarddiff — Function

logjac_forwarddiff(f, x; handleNaN, chunk, cfg)

Calculate the log Jacobian determinant of f at x using ForwardDiff.

Note

f should be a bijection, mapping from vectors of real numbers to vectors of equal length.

When handleNaN = true (the default), NaN log Jacobians are converted to -Inf.

TransformVariables.value_and_logjac_forwarddiff — Function

value_and_logjac_forwarddiff(
    f,
    x;
    flatten,
    handleNaN,
    chunk,
    cfg
)

Calculate the value and the log Jacobian determinant of f at x. flatten is used to get a vector out of the result that makes f a bijection.

TransformVariables.CustomTransform — Type

CustomTransform(g, f, flatten; chunk, cfg)

Wrap a custom transform y = f(transform(g, x)) in a type that calculates the log Jacobian of $∂y/∂x$ using ForwardDiff when necessary.

Usually, g::TransformReals, but when an integer is used, it amounts to the identity transformation with that dimension.

flatten should take the result from f, and return a flat vector with no redundant elements, so that $x ↦ y$ is a bijection. For example, for a covariance matrix the elements below the diagonal should be removed.

chunk and cfg can be used to configure ForwardDiff.JacobianConfig. cfg is used directly, while chunk = ForwardDiff.Chunk{N}() can be used to obtain a type-stable configuration.