RandomBernsteinPoly
Documentation for random Bernstein polynomials (Petrone, 1999).
RandomBernsteinPoly models the data-generating density as a mixture of fixed $\text{Beta}$ distributions with an unknown number of mixture components. The smoothness of the density estiates is primarily controlled through the number of basis densities $K$, which is selected in a data-driven manner. Mathematically, the Bernstein density model and prior distribution takes the following form,
\[\begin{align*} x_i \,|\, K, \boldsymbol{w} &\sim \sum_{k=1}^K w_{k}b_{K,k}(x_i), &i = 1,\ldots, n,\\ \boldsymbol{w} \,|\, K &\sim \text{Dirichlet}_K(a/K, \ldots, a/K),\\ K &\sim p(K), \end{align*}\]
where $b_{K,k}(\cdot)$ denotes the probability density function of the $\text{Beta}(k, K-k+1)$-distribution, $a> 0$ a fixed hyperparameter and $p(K)$ is a probability mass function supported on a finite subset of the positive integers.
Although this model is formulated in terms of data on the unit interval, it can be also be applied more broadly to data on the entire real line. By default, we estimate the support of the Bernstein density based on the extrema of the data set, but it is also possible to use a prespecified support via the bounds keyword argument.
This module implements the Gibbs sampler of Petrone (1999) for Markov chain Monte Carlo-based posterior inference.
Keep in mind that this sampler is quite slow even for moderate sample sizes, due to the expensive sweeps involved in the Gibbs sampler.
Module API
BayesDensityRandomBernsteinPoly.RandomBernsteinPoly — Type
RandomBernsteinPoly{T<:Real} <: AbstractBayesDensityModel{T}Struct representing a random Bernstein polyomial.
Constructors
RandomBernsteinPoly(x::AbstractVector{<:Real}; kwargs...)
RandomBernsteinPoly{T}(x::AbstractVector{<:Real}; kwargs...)Arguments
x: The data vector.
Keyword arguments
prior_components: ADistributions.DiscreteNonParametricdistribution instance specifying the prior on the number of componentsK. Defaults toDiscreteNonParametric(1:50, fill(T(1/50), 50)), corresponding to a uniform prior on the set {1, …, 50}.prior_strength: Strength parameter of the symmetric Dirichlet prior on the mixture weights. E.g. the prior on the mixture weights is Dirichlet(strength/K, ..., strength/K) conditional on the number of mixture components being equal toK. Defaults to1.0.bounds: A tuple giving the support of the B-spline mixture model.
Returns
rbp: A random Bernstein polynomial model object.
Examples
julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);
julia> rbp = RandomBernsteinPoly(x)
RandomBernsteinPoly{Float64} with 100 values for the number mixture components.
Using 5000 observations.
Hyperparameters:
prior_strength = 1.0
julia> fgm = RandomBernsteinPoly(x; prior_strength = 0.5, prior_components = DiscreteNonParametric(1:25, fill(1/25, 25)));Evaluating the pdf and cdf
Distributions.pdf — Method
pdf(
bsm::RandomBernsteinPoly,
params::NamedTuple,
t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}
pdf(
bsm::RandomBernsteinPoly,
params::AbstractVector{NamedTuple},
t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}Evaluate $f(t\, |\, \boldsymbol{\eta})$ for a given RandomBernsteinPoly when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.
The named tuple should contain fields named :w.
Distributions.cdf — Method
cdf(
bsm::RandomBernsteinPoly,
params::NamedTuple,
t::Union{Real, AbstractVector{<:Real}}
) -> Union{Real, Vector{<:Real}}
cdf(
bsm::RandomBernsteinPoly,
params::AbstractVector{NamedTuple},
t::Union{Real, AbstractVector{<:Real}}
) -> Matrix{<:Real}Evaluate $F(t\, |\, \boldsymbol{\eta})$ for a given RandomBernsteinPoly when the model parameters of the NamedTuple params are given by $\boldsymbol{\eta}$.
The named tuple should contain fields named :w.
Utility functions
BayesDensityCore.hyperparams — Method
hyperparams(
rbp::RandomBernsteinPoly{T}
) where {T} -> @NamedTuple{prior_components::DiscreteNonParametric{Int, T}, prior_strength::T}Returns the hyperparameters of the random Bernstein polynomial rbp as a NamedTuple.
Markov chain Monte Carlo
StatsBase.sample — Method
sample(
[rng::Random.AbstractRNG],
rbp::RandomBernsteinPoly{T},
n_samples::Int;
n_burnin::Int = min(1000, div(n_samples, 5)),
initial_params::NamedTuple = _get_default_initparams_mcmc(rbp)
) where {T} -> PosteriorSamples{T}Generate n_samples posterior samples from a RandomBernsteinPoly using the telescope sampler.
Arguments
rng: Optional random seed used for random variate generation.rbp: TheRandomBernsteinPolyobject for which posterior samples are generated.n_samples: The total number of samples (including burn-in).
Keyword arguments
n_burnin: Number of burn-in samples.initial_params: Initial values used in the MCMC algorithm. Should be supplied as aNamedTuplewith a single fieldK, whereKis a positive integer. Defaults to the integer nearest to0.5*sqrt(n)which has positive prior probability, wherenis the sample size.
Returns
ps: APosteriorSamplesobject holding the posterior samples and the original model object.
Examples
julia> using Random
julia> x = (1.0 .- (1.0 .- LinRange(0.0, 1.0, 5000)) .^(1/3)).^(1/3);
julia> rbp = RandomBernsteinPoly(x);
julia> ps1 = sample(rbp, 5_000);
julia> ps2 = sample(rbp, 5_000; n_burnin=2_000, initial_params = (K = 5,));