FiniteGaussianMixture

Documentation for finite Gaussian mixture models with a fixed number of mixture components.

This model is available through the BayesDensityFiniteGaussianMixture package.

FiniteGaussianMixture models the data-generating density as a mixture of normal distributions with a known number of mixture components. The prior-model specification can be written as follows:

\[\begin{align*} x_i\,|\, \boldsymbol{w}, \boldsymbol{\mu}, \boldsymbol{\sigma}^2 &\sim \sum_{k=1}^K \frac{w_k}{\sigma_k} \phi\Big(\frac{x_i - \mu_k}{\sigma_k}\Big), &i = 1,\ldots, n,\\ \boldsymbol{w} &\sim \mathrm{Dirichlet}_K(\alpha, \ldots, \alpha),\\ \mu_k &\sim \mathrm{Normal}(\mu_0, \sigma_0^2), &k = 1,\ldots, K,\\ \sigma_k^2 \,|\, \beta &\sim \mathrm{InverseGamma}(a_\sigma, \beta), &k = 1,\ldots, K,\\ \beta &\sim \mathrm{Gamma}(a_\beta, b_\beta) \end{align*}\]

where $\phi(\cdot)$ denotes the density of the standard normal distribution, $\mu_0 \in \mathbb{R}, \alpha, \sigma_0^2, a_\sigma, a_\beta, b_\beta > 0$ are fixed hyperparameters and $K$ is a fixed positive integer.[1]
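The hierarchy above can be simulated directly; the following is a minimal sketch using Distributions.jl (which this package extends), with illustrative hyperparameter values rather than the package defaults:

```julia
using Distributions, Random

# Illustrative hyperparameters (not the package defaults)
K, n = 3, 500
α, μ0, σ0², a_σ, a_β, b_β = 1.0, 0.0, 1.0, 2.0, 0.2, 0.2

rng = MersenneTwister(1)
β  = rand(rng, Gamma(a_β, 1 / b_β))          # rate b_β ⇒ scale 1/b_β in Distributions
σ² = rand(rng, InverseGamma(a_σ, β), K)      # component variances
μ  = rand(rng, Normal(μ0, sqrt(σ0²)), K)     # component means
w  = rand(rng, Dirichlet(K, α))              # symmetric Dirichlet weights

# Draw each observation by first picking a component, then sampling from it
z = rand(rng, Categorical(w), n)
x = [rand(rng, Normal(μ[k], sqrt(σ²[k]))) for k in z]
```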

Note

When using FiniteGaussianMixture, the number of mixture components $K$ must be specified by the user. A more flexible version of this model, where $K$ is further treated as a random variable with its own prior distribution, is available as RandomFiniteGaussianMixture.

For Markov chain Monte Carlo based inference, this module implements an augmented Gibbs sampling approach. The algorithm is essentially the Gibbs sampler sweep (excluding the reversible jump move) of Richardson and Green (1997). For variational inference, we implement a variant of Algorithm 5 in Ormerod and Wand (2010). Note that our version also includes an additional hyperprior on the rate parameter of the mixture scales, and the algorithm has been adjusted accordingly.
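The "augmentation" refers to latent component labels: given the current parameters, each observation's label is drawn from its full conditional before the remaining parameters are updated from their conjugate conditionals. A plain-Julia sketch of that label step (illustrative code, not the package's internal implementation):

```julia
# One allocation sweep: draw z[i] with probability ∝ w[k] * φ((x[i] - μ[k]) / σ[k]) / σ[k]
function sample_allocations(x, w, μ, σ²)
    K = length(w)
    z = Vector{Int}(undef, length(x))
    for (i, xi) in pairs(x)
        # log of the unnormalized allocation probabilities
        logp = [log(w[k]) - 0.5 * log(2π * σ²[k]) - (xi - μ[k])^2 / (2 * σ²[k]) for k in 1:K]
        p = exp.(logp .- maximum(logp))   # numerically stabilized softmax
        p ./= sum(p)
        # inverse-CDF draw of a categorical label
        u, c, k = rand(), 0.0, 0
        while c < u && k < K
            k += 1
            c += p[k]
        end
        z[i] = k
    end
    return z
end

z = sample_allocations([0.1, 0.9, 0.5], [0.5, 0.5], [0.0, 1.0], [0.04, 0.04])
```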

Module API

BayesDensityFiniteGaussianMixture.FiniteGaussianMixtureType
FiniteGaussianMixture{T<:Real} <: AbstractBayesDensityModel{T}

Struct representing a finite Gaussian mixture model with a fixed number of components.

Constructors

FiniteGaussianMixture(x::AbstractVector{<:Real}, K::Int; kwargs...)
FiniteGaussianMixture{T}(x::AbstractVector{<:Real}, K::Int; kwargs...)

Arguments

  • x: The data vector.
  • K: The number of mixture components.

Keyword arguments

  • prior_strength: Strength parameter of the symmetric Dirichlet prior on the mixture weights, i.e. the prior is Dirichlet(prior_strength, ..., prior_strength). Defaults to 1.0.
  • prior_location: Prior mean of the location parameters μ[k]. Defaults to the midpoint of the minimum and maximum values in the sample.
  • prior_variance: The prior variance of the location parameter μ[k]. Defaults to the sample range.
  • prior_shape: Prior shape parameter of the squared scale parameters σ2[k]. Defaults to 2.0.
  • hyperprior_shape: Prior shape parameter of the hyperprior on the rate parameter of σ2[k]. Defaults to 0.2.
  • hyperprior_rate: Prior rate parameter of the hyperprior on the rate parameter of σ2[k]. Defaults to 0.2*R^2, where R is the sample range.

Examples

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> fgm = FiniteGaussianMixture(x, 5)
FiniteGaussianMixture{Float64} with 5 components.
Using 5000 observations.
Hyperparameters:
 prior_location = 0.5, prior_variance = 1.0
 prior_shape = 2.0, hyperprior_shape = 0.2, hyperprior_rate = 10.0
 prior_strength = 1.0

julia> fgm = FiniteGaussianMixture(x, 5; prior_strength=10);
source

Evaluating the pdf and cdf

Distributions.pdfMethod
pdf(
    bsm::FiniteGaussianMixture,
    params::NamedTuple,
    t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}

pdf(
    bsm::FiniteGaussianMixture,
    params::AbstractVector{NamedTuple},
    t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}

Evaluate $f(t\, |\, \boldsymbol{\eta})$ for a given FiniteGaussianMixture when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.

The named tuple should contain fields named :μ, :σ2, :w and optionally :β.

source
Distributions.cdfMethod
cdf(
    bsm::FiniteGaussianMixture,
    params::NamedTuple,
    t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}

cdf(
    bsm::FiniteGaussianMixture,
    params::AbstractVector{NamedTuple},
    t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}

Evaluate $F(t\, |\, \boldsymbol{\eta})$ for a given FiniteGaussianMixture when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.

The named tuple should contain fields named :μ, :σ2, :w and optionally :β.

source
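Concretely, for parameters η = (w, μ, σ²) these methods evaluate the mixture density and distribution function defined in the model specification. A plain-Julia sketch of what the pdf computes (the cdf is analogous, with φ replaced by the standard normal Φ):

```julia
# f(t | η) = Σ_k (w[k] / σ[k]) * φ((t - μ[k]) / σ[k]), with φ the standard normal density
φ(z) = exp(-z^2 / 2) / sqrt(2π)

mixture_pdf(t, w, μ, σ²) =
    sum(w[k] / sqrt(σ²[k]) * φ((t - μ[k]) / sqrt(σ²[k])) for k in eachindex(w))

mixture_pdf(0.5, [0.5, 0.5], [0.3, 0.7], [0.01, 0.01])
```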

Utility functions

BayesDensityCore.hyperparamsMethod
hyperparams(
    gm::FiniteGaussianMixture{T}
) where {T} -> @NamedTuple{prior_strength::T, prior_location::T, prior_variance::T, prior_shape::T, hyperprior_shape::T, hyperprior_rate::T}

Returns the hyperparameters of the finite Gaussian mixture model gm as a NamedTuple.

source

Markov chain Monte Carlo

StatsBase.sampleMethod
sample(
    [rng::Random.AbstractRNG],
    fgm::FiniteGaussianMixture{T},
    n_samples::Int;
    n_burnin::Int              = min(1000, div(n_samples, 5)),
    initial_params::NamedTuple = _get_default_initparams_mcmc(fgm)
) where {T} -> PosteriorSamples{T}

Generate n_samples posterior samples from a FiniteGaussianMixture using an augmented Gibbs sampler.

Arguments

  • rng: Optional random number generator used for random variate generation.
  • fgm: The FiniteGaussianMixture object for which posterior samples are generated.
  • n_samples: The total number of samples (including burn-in).

Keyword arguments

  • n_burnin: Number of burn-in samples.
  • initial_params: Initial values used in the MCMC algorithm. Should be supplied as a NamedTuple with fields :μ, :σ2 and :w, where all are K-dimensional vectors.

The following constraints must also be satisfied: σ2[k] > 0 for all k, w[k] ≥ 0 for all k, and sum(w) ≈ 1.
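These requirements can be checked up front; a small sketch with a hypothetical helper (not part of the package API):

```julia
# Validate MCMC initial values for a K-component model (illustrative helper)
function valid_initial_params(p::NamedTuple, K::Int)
    length(p.μ) == K && length(p.σ2) == K && length(p.w) == K || return false
    all(>(0), p.σ2) || return false                  # σ2[k] > 0 for all k
    all(>=(0), p.w) || return false                  # w[k] ≥ 0 for all k
    return isapprox(sum(p.w), 1.0; atol = 1e-8)      # sum(w) ≈ 1
end

valid_initial_params((μ = [0.2, 0.8], σ2 = [1.0, 2.0], w = [0.7, 0.3]), 2)  # true
```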

Returns

  • ps: A PosteriorSamples object holding the posterior samples and the original model object.

Examples

julia> using Random

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> fgm = FiniteGaussianMixture(x, 4);

julia> ps1 = sample(fgm, 5_000);

julia> ps2 = sample(fgm, 5_000; n_burnin=2_000, initial_params = (μ = [0.2, 0.4, 0.6, 0.8], σ2 = [1.0, 1.0, 1.0, 1.0], w = [0.25, 0.25, 0.25, 0.25]));
source

Variational inference

BayesDensityFiniteGaussianMixture.FiniteGaussianMixtureVIPosteriorType
FiniteGaussianMixtureVIPosterior{T<:Real} <: AbstractVIPosterior{T}

Struct representing the variational posterior distribution of a FiniteGaussianMixture.

Fields

  • q_w: Distribution representing the optimal variational densities of the component weights q*(w|K).
  • q_μ: Product distribution representing the optimal variational density of the component means q*(μ|K).
  • q_σ2: Product distribution representing the optimal variational density of the component variances q*(σ2|K).
  • q_β: The optimal variational density q*(β|K) of the rate hyperparameter of the component variances.
  • fgm: The FiniteGaussianMixture to which the variational posterior was fit.
source
BayesDensityCore.varinfMethod
varinf(
    fgm::FiniteGaussianMixture{T};
    initial_params::NamedTuple = _get_default_initparams(fgm),
    max_iter::Int              = 2000,
    rtol::Real                 = 1e-6
) where {T} -> FiniteGaussianMixtureVIPosterior{T}

Find a variational approximation to the posterior distribution of a FiniteGaussianMixture using mean-field variational inference.

Arguments

  • fgm: The FiniteGaussianMixture whose posterior we want to approximate.

Keyword arguments

  • initial_params: Initial values of the VI parameters dirichlet_params location_params, variance_params, shape_params and rate_params, supplied as a NamedTuple.
  • max_iter: Maximal number of VI iterations. Defaults to 2000.
  • rtol: Relative tolerance used to determine convergence. Defaults to 1e-6.

Returns

  • vip: The fitted FiniteGaussianMixtureVIPosterior.
  • info: Information about the optimization, e.g. whether the convergence criterion was met.

Note

To perform the optimization for a fixed number of iterations irrespective of the convergence criterion, one can set rtol = 0.0, and max_iter equal to the desired total iteration count. Note that setting rtol to a strictly negative value will issue a warning.

Examples

julia> using Random

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> fgm = FiniteGaussianMixture(x, 10);

julia> vip, info = varinf(fgm);

julia> vip, info = varinf(fgm; rtol=1e-7, max_iter=3000);

Extended help

Convergence

The criterion used to determine convergence is that the relative change in the ELBO falls below the given rtol.
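In code, the stopping rule amounts to comparing successive ELBO values; a sketch under the assumption that the ELBO is tracked once per iteration (illustrative, not the package's internal function):

```julia
# Stop when the relative change in the ELBO falls below rtol
converged(elbo, elbo_prev, rtol) = abs(elbo - elbo_prev) <= rtol * abs(elbo_prev)

converged(-1234.56, -1234.5601, 1e-6)   # true: relative change ≈ 8e-8
```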

source
  • [1] We use the rate parameterization of the Gamma distribution here. This differs from the scale parameterization used by Distributions.
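In practice, this means converting when constructing the corresponding Distributions.jl object, which takes (shape, scale): with rate b_β the scale is 1/b_β (values below are illustrative):

```julia
using Distributions

a_β, b_β = 0.2, 0.2
prior_β = Gamma(a_β, 1 / b_β)      # rate b_β ⇒ scale 1/b_β
mean(prior_β)                       # = a_β / b_β = 1.0 under the rate parameterization
```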