RandomFiniteGaussianMixture
Documentation for finite Gaussian mixture models with a variable (random) number of mixture components.
This model is available through the BayesDensityFiniteGaussianMixture package.
RandomFiniteGaussianMixture models the data-generating density as a mixture of normal distributions with an unknown number of mixture components. Our prior and model specification follow Richardson and Green (1997), and can be described as follows:
\[\begin{align*} x_i\,|\, K, \boldsymbol{w}, \boldsymbol{\mu}, \boldsymbol{\sigma}^2 &\sim \sum_{k=1}^K \frac{w_k}{\sigma_k} \phi\Big(\frac{x_i - \mu_k}{\sigma_k}\Big), &i = 1,\ldots, n,\\ w_k \,|\, K &\sim \mathrm{Dirichlet}_K(\alpha, \ldots, \alpha),\\ \mu_k \,|\, K &\sim \mathrm{Normal}(\mu_0, \sigma_0^2), &k = 1,\ldots, K,\\ \sigma_k^2 \,|\, K, \beta &\sim \mathrm{InverseGamma}(a_\sigma, \beta), &k = 1,\ldots, K,\\ \beta &\sim \mathrm{Gamma}(a_\beta, b_\beta),\\ K &\sim p(K) \end{align*}\]
where $\phi(\cdot)$ denotes the density of the standard normal distribution, $\mu_0 \in \mathbb{R}, \alpha, \sigma_0^2, a_\sigma, a_\beta, b_\beta > 0$ are fixed hyperparameters and $p(K)$ is a probability mass function supported on a finite subset of the positive integers.[1]
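As a concrete check of the likelihood on the first line of the display above, the mixture density can be evaluated by hand in Base Julia. This is a standalone sketch for illustration only; the package's pdf method, documented below, serves this purpose in practice.

```julia
# Standard normal density φ(z).
phi(z) = exp(-z^2 / 2) / sqrt(2π)

# f(x | K, w, μ, σ²) = Σ_k (w_k / σ_k) φ((x - μ_k) / σ_k)
function mixture_density(x, w, μ, σ2)
    σ = sqrt.(σ2)
    return sum(w .* phi.((x .- μ) ./ σ) ./ σ)
end

# A two-component example with made-up parameter values.
w = [0.3, 0.7]; μ = [0.0, 2.0]; σ2 = [1.0, 0.25]
mixture_density(1.0, w, μ, σ2)
```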
For Markov chain Monte Carlo-based inference, we provide an implementation of the telescope sampler (Frühwirth-Schnatter et al., 2021).
The variational inference algorithm used to compute the posterior proceeds by first fitting separate mixture models for different values of $K$, recording the optimized evidence lower bound, $\mathrm{ELBO}(K)$, at the end of each optimization. The posterior over the number of mixture components $p(K\,|\, \boldsymbol{x})$ is then approximated via
\[q(K) \propto p(K)\,\exp\big\{\mathrm{ELBO}(K)\big\}.\]
This approximation is justified because the ELBO is a lower bound on the log-marginal likelihood $\log p(\boldsymbol{x}\,|\, K)$. The approximate posterior for the number of mixture components, together with the optimal variational densities given $K$, defines a distribution over a space of mixtures of variable dimension, which is then used to make inferences about the density of the given sample.
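The reweighting step above can be sketched in Base Julia. The ELBO values below are made-up numbers for illustration, and the maximum shift (log-sum-exp trick) guards against overflow when exponentiating large negative ELBOs:

```julia
# q(K) ∝ p(K) exp{ELBO(K)}, normalized stably on the log scale.
function normalize_qK(log_prior, elbos)
    logq = log_prior .+ elbos       # unnormalized log q(K)
    m = maximum(logq)               # shift so the largest term is exp(0) = 1
    q = exp.(logq .- m)
    return q ./ sum(q)
end

elbos = [-1520.3, -1498.7, -1499.1]   # hypothetical ELBO(K) for K = 1, 2, 3
prior = fill(1/3, 3)                  # uniform p(K) on {1, 2, 3}
qK = normalize_qK(log.(prior), elbos)
```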
The algorithm used to compute the conditional variational posterior $q(\boldsymbol{\mu}|k)\,q(\boldsymbol{\sigma}^2|k)\,q(\boldsymbol{w}|k)$ is a variant of Algorithm 5 in Ormerod and Wand (2010). Our version additionally places a hyperprior on the rate parameter of the mixture scales, and the algorithm has been adjusted to account for this.
There are two main ways of proceeding with Bayesian inference for the variational posterior. One possibility is to proceed with the single value $\widehat{K}$ that maximizes the variational probability $q(K)$, the so-called maximum a posteriori model. Posterior inference then proceeds via the conditional variational posterior $q\big(\boldsymbol{\mu}, \boldsymbol{\sigma}^2, \boldsymbol{w} | \widehat{K}\big)$. This model can be retrieved by utilizing the maximum_a_posteriori method on a fitted variational posterior, which can then be used for posterior inference.
Another possibility is to take a fully Bayesian approach, where we do not condition on a single value of $K$ but treat it as a random variable. To pursue this approach to posterior inference, one can simply use the object returned by varinf directly (e.g. for plotting or computing other posterior summary statistics).
Module API
BayesDensityFiniteGaussianMixture.RandomFiniteGaussianMixture — Type
RandomFiniteGaussianMixture{T<:Real} <: AbstractBayesDensityModel{T}
Struct representing a finite Gaussian mixture model with a variable (random) number of components.
Constructors
RandomFiniteGaussianMixture(x::AbstractVector{<:Real}; kwargs...)
RandomFiniteGaussianMixture{T}(x::AbstractVector{<:Real}; kwargs...)
Arguments
x: The data vector.
Keyword arguments
- prior_components: A Distributions.DiscreteNonParametric distribution instance specifying the prior on the number of components K. Defaults to DiscreteNonParametric(1:50, fill(T(1/50), 50)), corresponding to a uniform prior on the set {1, …, 50}.
- prior_strength: Strength parameter of the symmetric Dirichlet prior on the mixture weights, i.e. the prior is Dirichlet(strength, ..., strength). Defaults to 1.0.
- prior_location: Prior mean of the location parameters μ[k]. Defaults to the midpoint of the minimum and maximum values in the sample.
- prior_variance: Prior variance of the location parameters μ[k]. Defaults to the sample range.
- prior_shape: Prior shape parameter of the squared scale parameters σ2[k]. Defaults to 2.0.
- hyperprior_shape: Shape parameter of the hyperprior on the rate parameter of σ2[k]. Defaults to 0.2.
- hyperprior_rate: Rate parameter of the hyperprior on the rate parameter of σ2[k]. Defaults to 0.2*R^2, where R is the sample range.
Examples
julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);
julia> rfgm = RandomFiniteGaussianMixture(x)
RandomFiniteGaussianMixture{Float64} with 20 values for the number of mixture components.
Using 5000 observations.
Hyperparameters:
prior_location = 0.5, prior_variance = 1.0
prior_shape = 2.0, hyperprior_shape = 0.2, hyperprior_rate = 10.0
prior_strength = 1.0
julia> rfgm = RandomFiniteGaussianMixture(x; prior_components = DiscreteNonParametric(1:12, fill(1/12, 12)));
Evaluating the pdf and cdf
Distributions.pdf — Method
pdf(
bsm::RandomFiniteGaussianMixture,
params::NamedTuple,
t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}
pdf(
bsm::RandomFiniteGaussianMixture,
params::AbstractVector{NamedTuple},
t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}
Evaluate $f(t\, |\, \boldsymbol{\eta})$ for a given RandomFiniteGaussianMixture, where $\boldsymbol{\eta}$ denotes the model parameters supplied via the NamedTuple params.
The named tuple should contain fields named :μ, :σ2, :w and optionally :β.
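The shape of the two call patterns can be illustrated in Base Julia with a hand-rolled mixture density (not the package's pdf): one parameter draw evaluated over a grid yields a vector, and a vector of draws yields a matrix with one row per evaluation point. The parameter values below are made up.

```julia
# Hand-rolled mixture density; `p` is a NamedTuple with fields μ, σ2, w.
phi(z) = exp(-z^2 / 2) / sqrt(2π)
dens(t, p) = sum(p.w .* phi.((t .- p.μ) ./ sqrt.(p.σ2)) ./ sqrt.(p.σ2))

grid = range(0.0, 1.0; length = 5)
draw = (μ = [0.3, 0.7], σ2 = [0.01, 0.04], w = [0.5, 0.5])

# One draw over a grid: a length-5 vector.
v = [dens(t, draw) for t in grid]

# Two draws over the same grid: a 5×2 matrix (rows index grid points).
draws = [draw, (μ = [0.5], σ2 = [0.02], w = [1.0])]
M = [dens(t, p) for t in grid, p in draws]
```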
Distributions.cdf — Method
cdf(
bsm::RandomFiniteGaussianMixture,
params::NamedTuple,
t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}
cdf(
bsm::RandomFiniteGaussianMixture,
params::AbstractVector{NamedTuple},
t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}
Evaluate $F(t\, |\, \boldsymbol{\eta})$ for a given RandomFiniteGaussianMixture, where $\boldsymbol{\eta}$ denotes the model parameters supplied via the NamedTuple params.
The named tuple should contain fields named :μ, :σ2, :w and optionally :β.
Utility functions
BayesDensityCore.hyperparams — Method
hyperparams(
gm::RandomFiniteGaussianMixture{T}
) where {T} -> @NamedTuple{prior_components::DiscreteNonParametric{Int, T}, prior_strength::T, prior_location::T, prior_variance::T, prior_shape::T, prior_rate::T}
Returns the hyperparameters of the finite Gaussian mixture model gm as a NamedTuple.
Markov chain Monte Carlo
StatsBase.sample — Method
sample(
[rng::Random.AbstractRNG],
rfgm::RandomFiniteGaussianMixture{T},
n_samples::Int;
n_burnin::Int = min(1000, div(n_samples, 5)),
initial_params::NamedTuple = _get_default_initparams_mcmc(rfgm)
) where {T} -> PosteriorSamples{T}
Generate n_samples posterior samples from a RandomFiniteGaussianMixture using the telescope sampler.
Arguments
- rng: Optional random number generator used for random variate generation.
- rfgm: The RandomFiniteGaussianMixture object for which posterior samples are generated.
- n_samples: The total number of samples (including burn-in).
Keyword arguments
- n_burnin: Number of burn-in samples.
- initial_params: Initial values used in the MCMC algorithm. Should be supplied as a NamedTuple with fields :μ, :σ2 and :w, where all are K-dimensional vectors.
The following constraints must also be satisfied: σ2[k] > 0 for all k, w[k] ≥ 0 for all k, and sum(w) ≈ 1.
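A quick way to verify these constraints before calling sample is a small validator like the following. This is a hand-rolled sketch, not part of the package API; only the documented field names of initial_params are assumed.

```julia
# Check that a candidate `initial_params` NamedTuple satisfies the
# documented constraints: matching lengths, σ2[k] > 0, w[k] ≥ 0, sum(w) ≈ 1.
function valid_initial_params(p::NamedTuple)
    K = length(p.μ)
    length(p.σ2) == K && length(p.w) == K || return false
    all(>(0), p.σ2) || return false
    all(>=(0), p.w) || return false
    return isapprox(sum(p.w), 1.0; atol = 1e-8)
end

valid_initial_params((μ = [0.2, 0.8], σ2 = [1.0, 2.0], w = [0.7, 0.3]))
```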
Returns
ps: APosteriorSamplesobject holding the posterior samples and the original model object.
Examples
julia> using Random
julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);
julia> rfgm = RandomFiniteGaussianMixture(x);
julia> ps1 = sample(rfgm, 5_000);
julia> ps2 = sample(rfgm, 5_000; n_burnin=2_000, initial_params = (μ = [0.2, 0.8], σ2 = [1.0, 2.0], w = [0.7, 0.3]));
Variational inference
BayesDensityFiniteGaussianMixture.RandomFiniteGaussianMixtureVIPosterior — Type
RandomFiniteGaussianMixtureVIPosterior{T<:Real} <: AbstractVIPosterior{T}
Struct representing the variational posterior distribution of a RandomFiniteGaussianMixture.
Fields
- posterior_components: The posterior probabilities on the number of components. Note that probs(posterior_components)[K] corresponds to the posterior probability of model K.
- mixture_fits: Vector of FiniteGaussianMixtureVIPosterior objects, containing the fitted variational posterior distributions for different numbers of mixture components.
- rgfm: The RandomFiniteGaussianMixture to which the variational posterior was fit.
BayesDensityCore.varinf — Method
varinf(
    rfgm::RandomFiniteGaussianMixture{T};
    max_iter::Int = 2000,
    rtol::Real = 1e-6
) where {T} -> RandomFiniteGaussianMixtureVIPosterior{T}
Find a variational approximation to the posterior distribution of a RandomFiniteGaussianMixture using mean-field variational inference.
Arguments
rfgm: TheRandomFiniteGaussianMixturewhose posterior we want to approximate.
Keyword arguments
max_iter: Maximal number of VI iterations. Defaults to2000.rtol: Relative tolerance used to determine convergence. Defaults to1e-6.
Returns
- vip: A RandomFiniteGaussianMixtureVIPosterior object representing the variational posterior.
- info: A VariationalOptimizationResult describing the result of the optimization.
To perform the optimization for a fixed number of iterations irrespective of the convergence criterion, one can set rtol = 0.0, and max_iter equal to the desired total iteration count. Note that setting rtol to a strictly negative value will issue a warning.
Examples
julia> using Random
julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);
julia> rfgm = RandomFiniteGaussianMixture(x);
julia> vip = varinf(rfgm);
julia> vip = varinf(rfgm; rtol=1e-7, max_iter=3000);
BayesDensityFiniteGaussianMixture.posterior_components — Function
posterior_components(vip::RandomFiniteGaussianMixtureVIPosterior{T}) where {T} -> DiscreteNonParametric{Int, T}
Get the variational posterior probability mass function of the number of mixture components as a DiscreteNonParametric instance.
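For illustration, the maximum a posteriori number of components can be read off from the values and probabilities of such a probability mass function. The sketch below uses plain vectors with made-up probabilities rather than a DiscreteNonParametric instance:

```julia
# Pick the value of K with the largest posterior probability,
# mirroring what `maximum_a_posteriori` does internally.
support = 1:5
probs = [0.01, 0.08, 0.55, 0.30, 0.06]   # hypothetical q(K)
K̂ = support[argmax(probs)]
```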
BayesDensityFiniteGaussianMixture.maximum_a_posteriori — Function
maximum_a_posteriori(
vip::RandomFiniteGaussianMixtureVIPosterior{T}
) where {T} -> FiniteGaussianMixtureVIPosterior{T}
Get the variational posterior distribution that maximizes the approximate posterior probability on the number of components q(K).
[1] We use the rate parameterization of the Gamma distribution here. This differs from the scale parameterization used by Distributions.