BSplineMixture

Documentation for B-Spline Mixture Models.

This model is available through the BayesDensityBSplineMixture package.

BSplineMixture models the data-generating density as a mixture of cubic B-spline basis functions of a fixed, high dimension. The regularity of the estimate is controlled via a smoothness-inducing prior on the spline coefficients. Mathematically, the model can be written as

\[\begin{align*} x_i\,|\, \boldsymbol{\beta} &\sim \sum_{k=1}^K \theta_k b_k(x_i), &i = 1,\ldots, n,\\ \beta_k &\sim \text{Normal}(\mu_k, \sigma_\beta^2), &k = 1, 2,\\ \Delta^2 \big\{\beta_k - \mu_k\big\} \,|\, \tau^2, \boldsymbol{\delta}^2 &\sim \text{Normal}(0, \tau^2\delta_k^2), &k \geq 3\\ \tau^2 &\sim \text{InverseGamma}(a_\tau, b_\tau),\\ \delta^2_k &\sim \text{InverseGamma}(a_\delta, b_\delta), &k = 1, \ldots, K-3,\\ \end{align*}\]

where $b_k(\cdot)$ is a B-spline basis function, normalized to have unit integral, $\boldsymbol{\mu}\in \mathbb{R}^{K-1}, \sigma_\beta, a_\tau, b_\tau, a_\delta, b_\delta > 0$ are fixed hyperparameters, $\Delta^2 \alpha_k= \alpha_k - 2\alpha_{k-1} + \alpha_{k-2}$ is the discrete second-order difference operator and $\boldsymbol{\theta} = \boldsymbol{\theta}(\boldsymbol{\beta})$ is defined by the logistic stickbreaking-map,

\[\begin{align*} \theta_k &= \frac{e^{\beta_k}}{1 + e^{\beta_k}} \prod_{j = 1}^{k-1} \frac{1}{1 + e^{\beta_j}}, &k = 1,\ldots, K-1\\ \theta_K &= 1 - \sum_{k=1}^{K-1} \theta_k. \end{align*}\]

This model is supported on a compact interval, which by default is estimates based on the observed sample. It is also possible to manually specify the support of the model through the bounds keyword argument.

Module API

BayesDensityBSplineMixture.BSplineMixtureType
BSplineMixture{T<:Real} <: AbstractBayesDensityModel{T}

Struct representing a B-spline mixture model.

Constructors

BSplineMixture(x::AbstractVector{<:Real}; kwargs...)
BSplineMixture{T}(x::AbstractVector{<:Real}; kwargs...)

Arguments

  • x: The data vector.

Keyword arguments

  • K: B-spline basis dimension of a regular augmented spline basis. Defaults to max(100, min(200, ⌈n/5⌉))
  • bounds: A tuple giving the support of the B-spline mixture model.
  • n_bins: Lower bound on the number of bins used when fitting the BSplineMixture to data. Binned fitting can be disabled by setting this equal to nothing. The default setting uses unbinned fitting if length(x) ≤ 1200 and 1000 bins otherwise.
  • prior_global_shape: Shape hyperparameter for the global smoothing parameter τ². Defaults to 1.0.
  • prior_global_rate: Rate hyperparameter for the global smoothing parameter τ². Defaults to 1e-3.
  • prior_local_shape: Shape hyperparameter for the local smoothing parameters δₖ². Defaults to 0.5.
  • prior_local_rate: Rate hyperparameter for the local smoothing parameters δₖ². Defaults to 0.5.
  • prior_stdev: Prior standard deviation of the first two unconstrained spline parameters β₁ and β₂. Defaults to 1e5.

Returns

  • bsm: A B-Spline mixture model object.

Examples

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> model = BSplineMixture(x)
200-dimensional BSplineMixture{Float64}:
Using 5000 binned observations on a regular grid consisting of 1187 bins.
 support: (-0.05, 1.05)
Hyperparameters:
 prior_global_shape = 1.0, prior_global_rate = 0.001
 prior_local_shape = 0.5, prior_local_rate = 0.5

julia> model = BSplineMixture(x; K = 150, bounds=(0, 1), n_bins=nothing, prior_global_rate = 5e-3);

Extended help

Binned fitting

To disable binned fitting, one can set n_bins=nothing. Note that the binning is only used as part of the model fitting procedure, and the structure of the resulting fitted model object is the same regardless of whether the binning step is performed or not. Empirically, the results obtained from running the binned and unbinned model fitting procedures tend to be very similar. We therefore recommend using the binned fitting procedure, due to the large improvements in model fitting speed, particularly for larger samples.

For computational reasons, the supplied number of bins is rounded up to the nearest integer such that the bin boundaries overlap with the knots of the spline basis. This is done to ensure that at most 4 cubic splines have positive integrals over each bin.

Hyperparameter selection

The global variance parameter τ2 and the local variance parameters δ2[k] govern the smoothness of the B-spline mixture prior through the centered random walk prior on β | τ2, δ2:

β[k+2] = μ[k+2] + 2 {β[k+1] - μ[k+1]} - {β[k] - μ[k]} + τ * δ[k] * ϵ[k],

where ϵ[k] is standard normal. The first two parameters β[1] and β[2] are assigned diffuse N(0, σ²) priors.

The prior distributions of the local and global smoothing parameters are given by

τ² ∼ InverseGamma(a_τ, b_τ)
δₖ² ∼ InverseGamma(a_δ, b_δ),   1 ≤ k ≤ K-3.

As noninformative defaults, we suggest using prior_global_shape = 1, prior_global_rate = 1e-3, prior_local_shape = 0.5, prior_local_rate = 0.5 and prior_stdev = 1e5. To control the smoothness in the resulting density estimates, we recommend adjusting the value of prior_global_rate while keeping the other hyperparameters fixed. Setting prior_global_rate to a smaller value generally yields smoother curves. Similar priors for regression models suggest that values in the range [5e-5, 5e-3] are reasonable.

source

Evaluating the pdf and cdf

Distributions.pdfMethod
pdf(
    bsm::BSplineMixture,
    params::NamedTuple,
    t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}

pdf(
    bsm::BSplineMixture,
    params::AbstractVector{NamedTuple},
    t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}

Evaluate $f(t\, |\, \boldsymbol{\eta})$ for a given BSplineMixture when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.

The named tuple should contain a field named :spline_coefs or .

source
Distributions.cdfMethod
cdf(
    bsm::BSplineMixture,
    params::NamedTuple,
    t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}

cdf(
    bsm::BSplineMixture,
    params::AbstractVector{NamedTuple},
    t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}

Evaluate $F(t\, |\, \boldsymbol{\eta})$ for a given BSplineMixture when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.

The named tuple should contain a field named :spline_coefs or .

source

Utility functions

BayesDensityCore.hyperparamsMethod
hyperparams(
    bsm::BSplineMixture{T}
) where {T} -> @NamedTuple{prior_global_shape::T, prior_global_rate::T, prior_local_shape::T, prior_local_rate::T, prior_stdev::T}

Returns the hyperparameters of the B-Spline mixture model bsm as a NamedTuple.

source

Markov chain Monte Carlo

StatsBase.sampleMethod
sample(
    [rng::Random.AbstractRNG],
    bsm::BSplineMixture{T},
    n_samples::Int;
    n_burnin::Int              = min(1000, div(n_samples, 5)),
    initial_params::NamedTuple = get_default_initparams_mcmc(bsm)
) where {T} -> PosteriorSamples{T}

Generate n_samples posterior samples from a BSplineMixture using an augmented Gibbs sampler.

Arguments

  • rng: Optional random seed used for random variate generation.
  • bsm: The BSplineMixture object for which posterior samples are generated.
  • n_samples: The total number of samples (including burn-in).

Keyword arguments

  • n_burnin: Number of burn-in samples.
  • initial_params: Initial values used in the MCMC algorithm. Should be supplied as a NamedTuple with fields and :τ2, where is a K-1-dimensional vector and τ2 is a positive scalar.

Returns

  • ps: A PosteriorSamples object holding the posterior samples and the original model object.

Examples

julia> using Random

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> bsm = BSplineMixture(x);

julia> ps = sample(Xoshiro(1), bsm, 5000);
source

Variational inference

BayesDensityBSplineMixture.BSplineMixtureVIPosteriorType
BSplineMixtureVIPosterior{T<:Real, A<:MvNormalCanon{T}, B<:InverseGamma{T}, M<:BSplineMixture} <: AbstractVIPosterior{T}

Struct representing the variational posterior distribution of a BSplineMixture.

Fields

  • q_β: Distribution representing the optimal variational density q*(β).
  • q_τ: Distribution representing the optimal variational density q*(τ²).
  • q_δ: Product distribution corresponding to the optimal variational density q*(δ²).
  • bsm: The BSplineMixture to which the variational posterior was fit.
source
BayesDensityCore.varinfMethod
varinf(
    bsm::BSplineMixture{T};
    initial_params::NamedTuple = _get_default_initparams(bsm),
    max_iter::Int              = 2000
    rtol::Real                 = 1e-6
) where {T} -> BSplineMixtureVIPosterior{T}

Find a variational approximation to the posterior distribution of a BSplineMixture using mean-field variational inference.

Arguments

  • bsm: The BSplineMixture whose posterior we want to approximate.

Keyword arguments

  • initial_params: Initial values of the VI parameters μ_opt inv_Σ_opt, b_τ_opt and b_δ_opt, supplied as a NamedTuple.
  • max_iter: Maximal number of VI iterations. Defaults to 2000.
  • rtol: Relative tolerance used to determine convergence. Defaults to 1e-6.

Returns

Note

To sample for a fixed number of iterations irrespective of the convergence criterion, one can set rtol = 0.0, and max_iter equal to the desired total iteration count. Note that setting rtol to a strictly negative value will issue a warning.

Examples

julia> using Random

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> bsm = BSplineMixture(x);

julia> vip, info = varinf(bsm);

julia> vip, info = varinf(bsm; rtol=1e-7, max_iter=3000);

Extended help

Convergence

The criterion used to determine convergence is that the relative change in the ELBO falls below the given rtol.

source