RandomBernsteinPoly

Documentation for random Bernstein polynomials (Petrone, 1999).

RandomBernsteinPoly models the data-generating density as a mixture of fixed $\text{Beta}$ distributions with an unknown number of mixture components. The smoothness of the density estiates is primarily controlled through the number of basis densities $K$, which is selected in a data-driven manner. Mathematically, the Bernstein density model and prior distribution takes the following form,

\[\begin{align*} x_i \,|\, K, \boldsymbol{w} &\sim \sum_{k=1}^K w_{k}b_{K,k}(x_i), &i = 1,\ldots, n,\\ \boldsymbol{w} \,|\, K &\sim \text{Dirichlet}_K(a/K, \ldots, a/K),\\ K &\sim p(K), \end{align*}\]

where $b_{K,k}(\cdot)$ denotes the probability density function of the $\text{Beta}(k, K-k+1)$-distribution, $a> 0$ a fixed hyperparameter and $p(K)$ is a probability mass function supported on a finite subset of the positive integers.

Although this model is formulated in terms of data on the unit interval, it can be also be applied more broadly to data on the entire real line. By default, we estimate the support of the Bernstein density based on the extrema of the data set, but it is also possible to use a prespecified support via the bounds keyword argument.

This module implements the Gibbs sampler of Petrone (1999) for Markov chain Monte Carlo-based posterior inference.

Note

Keep in mind that this sampler is quite slow even for moderate sample sizes, due to the expensive sweeps involved in the Gibbs sampler.

Module API

BayesDensityRandomBernsteinPoly.RandomBernsteinPolyType
RandomBernsteinPoly{T<:Real} <: AbstractBayesDensityModel{T}

Struct representing a random Bernstein polyomial.

Constructors

RandomBernsteinPoly(x::AbstractVector{<:Real}; kwargs...)
RandomBernsteinPoly{T}(x::AbstractVector{<:Real}; kwargs...)

Arguments

  • x: The data vector.

Keyword arguments

  • prior_components: A Distributions.DiscreteNonParametric distribution instance specifying the prior on the number of components K. Defaults to DiscreteNonParametric(1:50, fill(T(1/50), 50)), corresponding to a uniform prior on the set {1, …, 50}.
  • prior_strength: Strength parameter of the symmetric Dirichlet prior on the mixture weights. E.g. the prior on the mixture weights is Dirichlet(strength/K, ..., strength/K) conditional on the number of mixture components being equal to K. Defaults to 1.0.
  • bounds: A tuple giving the support of the B-spline mixture model.

Returns

  • rbp: A random Bernstein polynomial model object.

Examples

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> rbp = RandomBernsteinPoly(x)
RandomBernsteinPoly{Float64} with 100 values for the number mixture components.
Using 5000 observations.
Hyperparameters:
 prior_strength = 1.0

julia> fgm = RandomBernsteinPoly(x; prior_strength = 0.5, prior_components = DiscreteNonParametric(1:25, fill(1/25, 25)));
source

Evaluating the pdf and cdf

Distributions.pdfMethod
pdf(
    bsm::RandomBernsteinPoly,
    params::NamedTuple,
    t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}

pdf(
    bsm::RandomBernsteinPoly,
    params::AbstractVector{NamedTuple},
    t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}

Evaluate $f(t\, |\, \boldsymbol{\eta})$ for a given RandomBernsteinPoly when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.

The named tuple should contain fields named :w.

source
Distributions.cdfMethod
cdf(
    bsm::RandomBernsteinPoly,
    params::NamedTuple,
    t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}

cdf(
    bsm::RandomBernsteinPoly,
    params::AbstractVector{NamedTuple},
    t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}

Evaluate $F(t\, |\, \boldsymbol{\eta})$ for a given RandomBernsteinPoly when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.

The named tuple should contain fields named :w.

source

Utility functions

BayesDensityCore.hyperparamsMethod
hyperparams(
    rbp::RandomBernsteinPoly{T}
) where {T} -> @NamedTuple{prior_components::DiscreteNonParametric{Int, T}, prior_strength::T}

Returns the hyperparameters of the random Bernstein polynomial rbp as a NamedTuple.

source

Markov chain Monte Carlo

StatsBase.sampleMethod
sample(
    [rng::Random.AbstractRNG],
    rbp::RandomBernsteinPoly{T},
    n_samples::Int;
    n_burnin::Int              = min(1000, div(n_samples, 5)),
    initial_params::NamedTuple = _get_default_initparams_mcmc(rbp)
) where {T} -> PosteriorSamples{T}

Generate n_samples posterior samples from a RandomBernsteinPoly using the telescope sampler.

Arguments

  • rng: Optional random seed used for random variate generation.
  • rbp: The RandomBernsteinPoly object for which posterior samples are generated.
  • n_samples: The total number of samples (including burn-in).

Keyword arguments

  • n_burnin: Number of burn-in samples.
  • initial_params: Initial values used in the MCMC algorithm. Should be supplied as a NamedTuple with a single field K, where K is a positive integer. Defaults to the integer nearest to 0.5*sqrt(n) which has positive prior probability, where n is the sample size.

Returns

  • ps: A PosteriorSamples object holding the posterior samples and the original model object.

Examples

julia> using Random

julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);

julia> rbp = RandomBernsteinPoly(x);

julia> ps1 = sample(rbp, 5_000);

julia> ps2 = sample(rbp, 5_000; n_burnin=2_000, initial_params = (K = 5,));
source