| Version: | 0.2-1 | 
| Encoding: | UTF-8 | 
| Title: | Bayesian Mixtures with an Unknown Number of Components | 
| Description: | Fits Bayesian finite mixtures with an unknown number of components using the telescoping sampler and different component distributions. For more details see Frühwirth-Schnatter et al. (2021) <doi:10.1214/21-BA1294>. | 
| Depends: | R (≥ 4.0.0) | 
| Imports: | abind, bayesm, DirichletReg, extraDistr, graphics, grDevices, MCMCpack, methods, mvtnorm, stats | 
| VignetteBuilder: | knitr, rmarkdown | 
| Suggests: | invgamma, klaR, knitr, mclust, poLCA, rmarkdown | 
| License: | GPL-2 | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-08-18 00:15:57 UTC; gruen | 
| Author: | Gertraud Malsiner-Walli | 
| Maintainer: | Gertraud Malsiner-Walli <Gertraud.Malsiner-Walli@wu.ac.at> | 
| Repository: | CRAN | 
| Date/Publication: | 2025-08-18 10:30:08 UTC | 
Simulated multivariate binary data
Description
Simulated multivariate binary data with a 3-group structure where the variables are correlated within the groups.
Format
A data frame with 500 observations and 31 variables:
- y1, ..., y30
- binary variables coded 1 and 2
- z
- integer variable with values 1, 2, and 3 indicating group membership
Solve label switching and identify mixture for a mixture of LCA models.
Description
Clustering of the draws in the point process representation (PPR) using
k-means clustering.
Usage
identifyLCAMixture(Func, Mu, Phi, Eta, S, centers)
Arguments
| Func | A numeric array; the functional of the sampled component-specific parameters used for clustering in the point process representation. | 
| Mu | A numeric array; the sampled Mu values to be reordered. | 
| Phi | A numeric array; the sampled Phi values to be reordered. | 
| Eta | A numeric array; the sampled component weights to be reordered. | 
| S | A numeric matrix; the sampled assignments to be reordered. | 
| centers | An integer specifying the number of clusters or a numeric matrix of initial cluster centers passed to k-means. | 
Details
The following steps are implemented:
- A functional of the draws of the component-specific parameters (Func) is passed to the function. The functionals of each component and iteration are stacked on top of each other in order to obtain a matrix where each row corresponds to the functional of one component.
- The functionals are clustered into K_+ clusters using k-means clustering. For each functional a group label is obtained.
- The obtained labels of the functionals are used to construct a classification for each MCMC iteration. Those classifications which are a permutation of (1,\ldots,K_+) are used to reorder the Mu, Phi and Eta draws and the assignment matrix S. This results in an identified mixture model.
- Note that only iterations resulting in permutations are used for parameter estimation and deriving the final partition. Those MCMC iterations where the obtained classifications of the functionals are not a permutation of (1,\ldots,K_+) are discarded, as no unique assignment of functionals to components can be made. If the non-permutation rate, i.e., the proportion of MCMC iterations where the obtained classifications of the functionals are not a permutation, is high, this is an indication of a poor clustering solution, as the functionals are not clearly separated.
Value
A named list containing:
-  "S": reordered assignments.
-  "Mu": reordered Mu matrix.
-  "Phi": reordered Phi matrix.
-  "Eta": reordered weights.
-  "non_perm_rate": proportion of draws where the clustering did not result in a permutation and hence no relabeling could be performed; this is the proportion of draws discarded.
Solve label switching and identify mixture.
Description
Clustering of the draws in the point process representation (PPR) using
k-means clustering.
Usage
identifyMixture(Func, Mu, Eta, S, centers)
Arguments
| Func | A numeric array; the functional of the sampled component-specific parameters used for clustering in the point process representation. | 
| Mu | A numeric array; the sampled Mu values to be reordered. | 
| Eta | A numeric array; the sampled component weights to be reordered. | 
| S | A numeric matrix; the sampled assignments to be reordered. | 
| centers | An integer specifying the number of clusters or a numeric matrix of initial cluster centers passed to k-means. | 
Details
The following steps are implemented:
- A functional of the draws of the component-specific parameters (Func) is passed to the function. The functionals of each component and iteration are stacked on top of each other in order to obtain a matrix where each row corresponds to the functional of one component.
- The functionals are clustered into K_+ clusters using k-means clustering. For each functional a group label is obtained.
- The obtained labels of the functionals are used to construct a classification for each MCMC iteration. Those classifications which are a permutation of (1,\ldots,K_+) are used to reorder the Mu and Eta draws and the assignment matrix S. This results in an identified mixture model.
- Note that only iterations resulting in permutations are used for parameter estimation and deriving the final partition. Those MCMC iterations where the obtained classifications of the functionals are not a permutation of (1,\ldots,K_+) are discarded, as no unique assignment of functionals to components can be made. If the non-permutation rate, i.e., the proportion of MCMC iterations where the obtained classifications of the functionals are not a permutation, is high, this is an indication of a poor clustering solution, as the functionals are not clearly separated.
Value
A named list containing:
-  "S": reordered assignments.
-  "Mu": reordered Mu matrix.
-  "Eta": reordered weights.
-  "non_perm_rate": proportion of draws where the clustering did not result in a permutation and hence no relabeling could be performed; this is the proportion of draws discarded.
Plot multivariate categorical data.
Description
For each pair of variables, the level combinations are plotted using circles whose sizes are proportional to the frequencies of the level combinations.
Usage
plotBubble(x, bg = "grey")
Arguments
| x | A matrix or data.frame; the data consisting of categorical variables. | 
| bg | The colour(s) used to fill the symbols; defaults to "grey". | 
Value
NULL
Examples
with(chickwts, plotBubble(data.frame(cut(weight, 100 * 1:5), feed)))
Pairwise scatter plots of the data.
Description
Scatter plots of the observations are plotted by selecting pairs of dimensions, potentially colored by a known classification.
Usage
plotScatter(x, z, label = "", trim = 0)
Arguments
| x | A matrix or data.frame; the data consisting of metric variables. | 
| z | A vector; a known classification used to color the observations. | 
| label | A character string; the text to include in the axes labels. | 
| trim | A scalar numeric; the trimming proportion. | 
Value
NULL
Examples
plotScatter(iris[, 1:4], iris$Species, label = "dim")
Specify prior on \alpha.
Description
Obtain a function to evaluate the log prior specified
for \alpha.
Usage
priorOnAlpha_spec(H = c("alpha_const", "gam_05_05", "gam_1_2", "F_6_3"))
Arguments
| H | A character indicating which specification should be used. | 
Details
The following prior specifications are supported:
-  "alpha_const":\alphais fixed at 1.
-  "gam_05_05":\alpha \simgamma(0.5, 0.5), i.e., shape = 0.5, rate = 0.5.
-  "gam_1_2":\alpha \simgamma(1, 2), i.e., shape = 1, rate = 2.
-  "F_6_3":\alpha \simF(6, 3), i.e., an F-distribution with degrees of freedom equal to 6 and 3.
Value
A named list containing:
-  "log_pAlpha": a function of the log prior of\alpha.
-  "param": a list with the parameters.
Specify prior on e0.
Description
Obtain a function to evaluate the log prior specified
for e_0.
Usage
priorOnE0_spec(E = c("G_1_20", "e0const"), e0)
Arguments
| E | A character indicating which specification should be used. | 
| e0 | A numeric scalar giving the fixed value of e_0 (used if E = "e0const"). | 
Details
The following prior specifications are supported:
-  "G_1_20":e_0 \simgamma(1, 20), i.e., shape = 1, rate = 20.
-  "e0const":e_0is fixed ate0.
Value
A named list containing:
-  "log_p_e0": a function of the log prior ofe_0.
-  "param": a list with the parameters.
Specify prior on K.
Description
Obtain a function to evaluate the log prior
specified for K.
Usage
priorOnK_spec(
  P = c("fixedK", "Unif", "BNB_111", "BNB_212", "BNB_143", "BNB_443", "BNB_943",
    "Pois_1", "Pois_4", "Pois_9", "Geom_05", "Geom_02", "Geom_01", "NB_11", "NB_41",
    "NB_91"),
  K
)
Arguments
| P | A character indicating which specification should be used. See Details for suitable values. | 
| K | A numeric or integer scalar; the fixed value of K (if P = "fixedK") or the upper limit for K (if P = "Unif"). | 
Details
The following prior specifications are supported:
-  "fixedK": K has the fixed value K (second argument).
-  "Unif":K \simUnif[1,K], where the upper limit is given by K (second argument).
-  "BNB_111":K-1 \simBNB(1,1,1), i.e.,K-1follows a beta-negative binomial distribution with parameters(1,1,1).
-  "BNB_212":K-1 \simBNB(2,1,2), i.e.,K-1follows a beta-negative binomial distribution with parameters(2,1,2).
-  "BNB_143":K-1 \simBNB(1,2,1), i.e.,K-1follows a beta-negative binomial distribution with parameters(1,4,3).
-  "BNB_443":K-1 \simBNB(4,4,3), i.e.,K-1follows a beta-negative binomial distribution with parameters(4,4,3).
-  "BNB_943":K-1 \simBNB(9,4,3), i.e.,K-1follows a beta-negative binomial distribution with parameters(9,4,3).
-  "Pois_1":K-1 \simpois(1), i.e.,K-1follows a Poisson distribution with rate 1.
-  "Pois_4":K-1 \simpois(4), i.e.,K-1follows a Poisson distribution with rate 4.
-  "Pois_9":K-1 \simpois(9), i.e.,K-1follows a Poisson distribution with rate 9.
-  "Geom_05":K-1 \simgeom(0.5), i.e.,K-1follows a geometric distribution with success probabilityp=0.5and densityf(x)=p(1-p)^x.
-  "Geom_02":K-1 \simgeom(0.2), i.e.,K-1follows a geometric distribution with success probabilityp=0.2and densityf(x)=p(1-p)^x.
-  "Geom_01":K-1 \simgeom(0.1), i.e.,K-1follows a geometric distribution with success probabilityp=0.1and densityf(x)=p(1-p)^x.
-  "NB_11":K-1 \simnbinom(1,0.5), i.e.,K-1follows a negative-binomial distribution withsize=1andp=0.5.
-  "NB_41":K-1 \simnbinom(4,0.5), i.e.,K-1follows a negative-binomial distribution withsize=4andp=0.5.
-  "NB_91":K-1 \simnbinom(9,0.5), i.e.,K-1follows a negative-binomial distribution withsize=9andp=0.5.
Value
A named list containing:
-  "log_pK": a function of the log prior ofK.
-  "param": a list with the parameters.
Sample alpha conditional on partition and K using a Metropolis-Hastings step with a log-normal proposal.
Description
Sample \alpha conditional on the current partition and value of K using a Metropolis-Hastings step with a log-normal proposal.
Usage
sampleAlpha(N, Nk, K, alpha, s0_proposal, log_pAlpha)
Arguments
| N | A number; indicating the sample size. | 
| Nk | An integer vector; indicating the group sizes in the partition. | 
| K | A number; indicating the number of components. | 
| alpha | A numeric value; indicating the current value of \alpha. | 
| s0_proposal | A numeric value; indicating the standard deviation of the log-normal random walk proposal. | 
| log_pAlpha | A function; evaluating the log prior of \alpha. | 
Value
A named list containing:
-  "alpha": a numeric, the new\alphavalue.
-  "acc": logical indicating acceptance.
Sample e0 conditional on partition and K using a Metropolis-Hastings step with a log-normal proposal.
Description
Sample e_0 conditional on the current partition and value of K using a Metropolis-Hastings step with a log-normal proposal.
Usage
sampleE0(K, Kp, N, Nk, s0_proposal, e0, log_p_e0)
Arguments
| K | A number; indicating the number of components. | 
| Kp | A number; indicating the number of filled, i.e., non-empty, components. | 
| N | A number; indicating the sample size. | 
| Nk | An integer vector; indicating the group sizes in the partition. | 
| s0_proposal | A numeric value; indicating the standard deviation of the random walk proposal. | 
| e0 | A numeric value; indicating the current value of e_0. | 
| log_p_e0 | A function; evaluating the log prior of e_0. | 
Value
A named list containing:
-  "e0": a numeric, the newe_0value.
-  "acc": logical indicating acceptance.
Sample K conditional on \alpha where e0 = \alpha/K.
Description
This sampling step only relies on the current partition and is independent of the current component-specific parameters, see Frühwirth-Schnatter et al (2021).
Usage
sampleK_alpha(Kp_j, Kmax, Nk_j, alpha, log_pK)
Arguments
| Kp_j | A number; indicating the current value of K_+, the number of filled components. | 
| Kmax | A number; indicating the maximum value of K. | 
| Nk_j | A numeric vector; indicating the group sizes in the partition, i.e., the current number of observations in the filled components. | 
| alpha | A number; indicating the value of the parameter \alpha. | 
| log_pK | A function; evaluating the log prior of K. | 
Value
A number indicating the new value of K.
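A minimal sketch with hypothetical inputs (three filled components with the given group sizes):
priorOnK <- priorOnK_spec("BNB_143")
sampleK_alpha(Kp_j = 3, Kmax = 50, Nk_j = c(40, 35, 25), alpha = 1,
              log_pK = priorOnK$log_pK)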
Sample K conditional on e0 (fixed or random, but not depending on K).
Description
This sampling step only relies on the current partition and is independent of the current component-specific parameters, see Frühwirth-Schnatter et al (2021).
Usage
sampleK_e0(Kp_j, Kmax, log_pK, log_p_e0, e0, N)
Arguments
| Kp_j | A number; indicating the current value of K_+, the number of filled components. | 
| Kmax | A number; indicating the maximum value of K. | 
| log_pK | A function; evaluating the log prior of K. | 
| log_p_e0 | A function; evaluating the log prior of e_0. | 
| e0 | A number; indicating the current value of e_0. | 
| N | A number; indicating the number of observations. | 
Value
A number indicating the new value of K.
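A minimal sketch with hypothetical inputs and a fixed e_0:
priorOnK <- priorOnK_spec("BNB_143")
priorOnE0 <- priorOnE0_spec("e0const", 0.01)
sampleK_e0(Kp_j = 3, Kmax = 50, log_pK = priorOnK$log_pK,
           log_p_e0 = priorOnE0$log_p_e0, e0 = 0.01, N = 100)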
Telescoping sampling of the LCA model where a prior on the number of components K is specified.
Description
- The MCMC scheme is implemented as suggested in Frühwirth-Schnatter et al (2021). 
- The priors on the model parameters are specified as in Frühwirth-Schnatter et al (2021), see the vignette for details and notation. 
Usage
sampleLCA(
  y,
  S,
  pi,
  eta,
  a0,
  M,
  burnin,
  thin,
  Kmax,
  G = c("MixDynamic", "MixStatic"),
  priorOnK,
  priorOnWeights,
  verbose = FALSE
)
Arguments
| y | A numeric matrix; containing the data. | 
| S | A numeric matrix; containing the initial cluster assignments. | 
| pi | A numeric vector; containing the initial cluster-specific success probabilities. | 
| eta | A numeric vector; containing the initial cluster sizes. | 
| a0 | A numeric vector; containing the parameters of the prior on the cluster-specific success probabilities. | 
| M | A numeric scalar; specifying the number of recorded iterations. | 
| burnin | A numeric scalar; specifying the number of burn-in iterations. | 
| thin | A numeric scalar; specifying the thinning used for the iterations. | 
| Kmax | A numeric scalar; the maximum number of components. | 
| G | A character string; either "MixDynamic" or "MixStatic". | 
| priorOnK | A named list; providing the prior on the number of components K, see priorOnK_spec(). | 
| priorOnWeights | A named list; providing the prior on the mixture weights. | 
| verbose | A logical; indicating if some intermediate clustering results should be printed. | 
Value
A named list containing:
-  "Pi": sampled component-specific success probabilities.
-  "Eta": sampled weights.
-  "S": sampled assignments.
-  "Nk": number of observations assigned to the different components, for each iteration.
-  "K": sampled number of components.
-  "Kplus": number of filled, i.e., non-empty components, for each iteration.
-  "e0": sampled Dirichlet parameter of the prior on the weights (ife_0is random).
-  "alpha": sampled Dirichlet parameter of the prior on the weights (if\alphais random).
-  "acc": logical vector indicating acceptance in the Metropolis-Hastings step when sampling eithere_0or\alpha.
Examples
if (requireNamespace("poLCA", quietly = TRUE)) {
    data("carcinoma", package = "poLCA")
    y <- carcinoma
    N <- nrow(y)
    r <- ncol(y)
    
    M <- 200
    thin <- 1
    burnin <- 100
    Kmax <- 50  
    Kinit <- 10
    
    G <- "MixDynamic"
    priorOnAlpha <- priorOnAlpha_spec("gam_1_2")
    priorOnK <- priorOnK_spec("Pois_1")
    
    cat <- apply(y, 2, max)
    a0 <- rep(1, sum(cat))
    cl_y <- kmeans(y, centers = Kinit, iter.max = 20)
    S_0 <- cl_y$cluster
    eta_0 <- cl_y$size/N
    pi_0 <- do.call("cbind", lapply(1:r, function(j) {
        prop.table(table(S_0, y[, j]), 1)
    }))
    result <- sampleLCA(
        y, S_0, pi_0, eta_0, a0, 
        M, burnin, thin, Kmax, 
        G, priorOnK, priorOnAlpha)
    K <- result$K
    Kplus <- result$Kplus   
    
    plot(K, type = "l", ylim = c(0, max(K)),  
         xlab = "iteration", main = "",
         ylab = expression("K" ~ "/" ~ K["+"]), col = 1)
    lines(Kplus, col = 2)
    legend("topright", legend = c("K", expression(K["+"])),
           col = 1:2, lty = 1, box.lwd = 0)
}
Telescoping sampling of the mixture of LCA models where a prior on the number of components K is specified.
Description
- The MCMC scheme is implemented as suggested in Malsiner-Walli et al (2024). 
- Also the priors on the model parameters are specified as in Malsiner-Walli et al (2024), see the vignette for details and notation. 
Usage
sampleLCAMixture(
  y,
  S,
  L,
  pi,
  eta,
  mu,
  phi,
  a_00,
  a_mu,
  a_phi,
  b_phi,
  c_phi,
  d_phi,
  M,
  burnin,
  thin,
  Kmax,
  s_mu,
  s_phi,
  eps,
  G,
  priorOnWeights,
  d0,
  priorOnK,
  verbose = FALSE
)
Arguments
| y | A numeric matrix; containing the data where categories are coded with numbers. | 
| S | A numeric matrix; containing the initial cluster assignments. | 
| L | A numeric scalar; specifying the number of classes within each component. | 
| pi | A numeric matrix; containing the initial class-specific occurrence probabilities. | 
| eta | A numeric vector; containing the initial cluster sizes. | 
| mu | A numeric matrix; containing the initial central component occurrence probabilities. | 
| phi | A numeric matrix; containing the initial component- and variable-specific precisions. | 
| a_00 | A numeric scalar; specifying the prior parameter a_00. | 
| a_mu | A numeric vector; containing the prior parameter a_mu. | 
| a_phi | A numeric vector; containing the prior parameter a_phi for each variable. | 
| b_phi | A numeric vector; containing the initial value of b_phi for each variable. | 
| c_phi | A numeric vector; containing the prior parameter c_phi for each variable. | 
| d_phi | A numeric vector; containing the prior parameter d_phi for each variable. | 
| M | A numeric scalar; specifying the number of recorded iterations. | 
| burnin | A numeric scalar; specifying the number of burn-in iterations. | 
| thin | A numeric scalar; specifying the thinning used for the iterations. | 
| Kmax | A numeric scalar; the maximum number of components. | 
| s_mu | A numeric scalar; specifying the standard deviation of the proposal in the Metropolis-Hastings step when sampling mu. | 
| s_phi | A numeric scalar; specifying the standard deviation of the proposal in the Metropolis-Hastings step when sampling phi. | 
| eps | A numeric scalar; a regularizing constant to bound the Dirichlet proposal away from the boundary in the Metropolis-Hastings step when sampling mu. | 
| G | A character string; either "MixDynamic" or "MixStatic". | 
| priorOnWeights | A named list; providing the prior on the mixture weights. | 
| d0 | A numeric scalar; containing the Dirichlet prior parameter on the class weights. | 
| priorOnK | A named list; providing the prior on the number of components K, see priorOnK_spec(). | 
| verbose | A logical; indicating if some intermediate clustering results should be printed. | 
Value
A named list containing:
-  "Eta": sampled weights.
-  "S": sampled assignments.
-  "K": sampled number of components.
-  "Kplus": number of filled, i.e., non-empty components, for each iteration.
-  "Nk": number of observations assigned to the different components, for each iteration.
-  "Nl": number of observations assigned to the different classes within the components, for each iteration.
-  "e0": sampled Dirichlet parameter of the prior on the weights (ife_0is random).
-  "alpha": sampled Dirichlet parameter of the prior on the weights (if\alphais random).
-  "acc": logical vector indicating acceptance in the Metropolis-Hastings step when sampling eithere_0or\alpha.
-  "Mu": sampled central component occurrence probabilities.
-  "Phi": sampled precisions.
-  "acc_mu": the acceptance rate in the Metropolis-Hastings step when sampling\mu_{k,j}.
-  "acc_phi": the acceptance rate in the Metropolis-Hastings step when sampling\phi_{k,j}.
-  "nonnormpost_mode": parameter values corresponding to the mode of the nonnormalized posterior.
-  "Pi_k": sampled weighted component occurrence probabilities.
Examples
data("SimData", package = "telescope")
y <- as.matrix(SimData[, 1:30])
z <- SimData[, 31]
N <- nrow(y)
r <- ncol(y)
    
M <- 5
thin <- 1
burnin <- 0
Kmax <- 50  
Kinit <- 10
    
G <- "MixDynamic"
priorOnAlpha <- priorOnAlpha_spec("gam_1_2")
priorOnK <- priorOnK_spec("Pois_1")
d0 <- 1  
cat <- apply(y, 2, max)
a_mu <- rep(20, sum(cat))
mu_0 <- matrix(rep(rep(1/cat, cat), Kinit),
  byrow = TRUE, nrow = Kinit)
c_phi <- 30; d_phi <- 1; b_phi <- rep(10, r)
a_phi <- rep(1, r)
phi_0 <- matrix(cat, Kinit, r, byrow = TRUE)
a_00 <- 0.05
s_mu <- 2; s_phi <- 2; eps <- 0.01 
set.seed(1234)
cl_y <- kmeans(y, centers = Kinit, nstart = 100, iter.max = 50)
S_0 <- cl_y$cluster
eta_0 <- cl_y$size/N
I_0 <- rep(1L, N)
L <- 2
for (k in 1:Kinit) {
  cl_size <- sum(S_0 == k)
  I_0[S_0 == k] <- rep(1:L, length.out = cl_size)
}
index <- c(0, cumsum(cat))
low <- (index + 1)[-length(index)]
up <- index[-1]
pi_km <- array(NA_real_, dim = c(Kinit, L, sum(cat)))
rownames(pi_km) <- paste0("k_", 1:Kinit)
for (k in 1:Kinit) {
  for (l in 1:L) {
    index <- (S_0 == k) & (I_0 == l)
    for (j in 1:r) {
      pi_km[k, l, low[j]:up[j]] <- tabulate(y[index, j], cat[j]) / sum(index)
    }
  }
}
pi_0 <- pi_km 
result <- sampleLCAMixture(
    y, S_0, L,
    pi_0, eta_0, mu_0, phi_0,
    a_00, a_mu, a_phi, b_phi, c_phi, d_phi,
    M, burnin, thin, Kmax,
    s_mu, s_phi, eps,
    G, priorOnAlpha, d0, priorOnK)
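As in the other sampler examples, the traces of the number of components K and of the filled components K_+ might be inspected afterwards (a sketch based on the returned values):
K <- result$K
Kplus <- result$Kplus
plot(K, type = "l", ylim = c(0, max(K)),
     xlab = "iteration", main = "",
     ylab = expression("K" ~ "/" ~ K["+"]), col = 1)
lines(Kplus, col = 2)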
Telescoping sampling of a Bayesian finite multivariate Gaussian mixture where a prior on the number of components is specified.
Description
- The MCMC scheme is implemented as suggested in Frühwirth-Schnatter et al (2021). 
- The priors on the model parameters are specified as in Frühwirth-Schnatter et al (2021), see the vignette for details and notation. 
- The parameterizations of the Wishart and inverse Wishart distribution are used as in Frühwirth-Schnatter et al (2021), see also the vignette. 
Usage
sampleMultNormMixture(
  y,
  S,
  mu,
  Sigma,
  eta,
  c0,
  g0,
  G0,
  C0,
  b0,
  B0,
  M,
  burnin,
  thin,
  Kmax,
  G = c("MixDynamic", "MixStatic"),
  priorOnK,
  priorOnWeights,
  verbose = FALSE
)
Arguments
| y | A numeric matrix; containing the data. | 
| S | A numeric matrix; containing the initial cluster assignments. | 
| mu | A numeric matrix; containing the initial cluster-specific mean values. | 
| Sigma | A numeric matrix; containing the initial cluster-specific variance-covariance values. | 
| eta | A numeric vector; containing the initial cluster sizes. | 
| c0 | A numeric vector; hyperparameter of the prior on \Sigma_k. | 
| g0 | A numeric vector; hyperparameter of the prior on C0. | 
| G0 | A numeric vector; hyperparameter of the prior on C0. | 
| C0 | A numeric vector; initial value of the hyperparameter C0. | 
| b0 | A numeric vector; hyperparameter of the prior on \mu_k. | 
| B0 | A numeric vector; hyperparameter of the prior on \mu_k. | 
| M | A numeric scalar; specifying the number of recorded iterations. | 
| burnin | A numeric scalar; specifying the number of burn-in iterations. | 
| thin | A numeric scalar; specifying the thinning used for the iterations. | 
| Kmax | A numeric scalar; the maximum number of components. | 
| G | A character string; either "MixDynamic" or "MixStatic". | 
| priorOnK | A named list; providing the prior on the number of components K, see priorOnK_spec(). | 
| priorOnWeights | A named list; providing the prior on the mixture weights. | 
| verbose | A logical; indicating if some intermediate clustering results should be printed. | 
Value
A named list containing:
-  "Mu": sampled component means.
-  "Eta": sampled weights.
-  "S": sampled assignments.
-  "Nk": number of observations assigned to the different components, for each iteration.
-  "K": sampled number of components.
-  "Kplus": number of filled, i.e., non-empty components, for each iteration.
-  "e0": sampled Dirichlet parameter of the prior on the weights (ife_0is random).
-  "alpha": sampled Dirichlet parameter of the prior on the weights (if\alphais random).
-  "acc": logical vector indicating acceptance in the Metropolis-Hastings step when sampling eithere_0or\alpha.
Examples
y <- iris[, 1:4]
z <- iris$Species
r <- ncol(y)
M <- 50
thin <- 1
burnin <- 0
Kmax <- 40  
Kinit <- 10
G <- "MixStatic"      
priorOnE0 <- priorOnE0_spec("G_1_20", 1)
priorOnK <- priorOnK_spec("BNB_143")
R <- apply(y, 2, function(x) diff(range(x)))
b0 <- apply(y, 2, median)
B_0 <- rep(1, r)  
B0 <- diag((R^2) * B_0)
c0 <- 2.5 + (r-1)/2
g0 <- 0.5 + (r-1)/2
G0 <- 100 * g0/c0 * diag((1/R^2), nrow = r)
C0 <- g0 * chol2inv(chol(G0))
cl_y <- kmeans(y, centers = Kinit, nstart = 100)
S_0 <- cl_y$cluster
mu_0 <- t(cl_y$centers)
eta_0 <- rep(1/Kinit, Kinit)
Sigma_0 <- array(0, dim = c(r, r, Kinit))
Sigma_0[, , 1:Kinit] <- 0.5 * C0
result <- sampleMultNormMixture(
  y, S_0, mu_0, Sigma_0, eta_0,
  c0, g0, G0, C0, b0, B0,  
  M, burnin, thin, Kmax, G, priorOnK, priorOnE0)
K <- result$K
Kplus <- result$Kplus   
plot(K, type = "l", ylim = c(0, max(K)),
     xlab = "iteration", main = "",
     ylab = expression("K" ~ "/" ~ K["+"]), col = 1)
lines(Kplus, col = 2)
legend("topright", legend = c("K", expression(K["+"])),
       col = 1:2, lty = 1, box.lwd = 0)
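A possible follow-up (sketch): summarize the posterior draws of K and K_+ and use the posterior mode of K_+ as an estimate of the number of data clusters.
p_K <- table(K) / length(K)                          # posterior frequencies of K
p_Kplus <- table(Kplus) / length(Kplus)              # posterior frequencies of K_+
Kplus_hat <- as.numeric(names(which.max(p_Kplus)))   # posterior mode of K_+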
Telescoping sampling of a Bayesian finite Poisson mixture with a prior on the number of components K.
Description
- The MCMC scheme is implemented as suggested in Frühwirth-Schnatter et al (2021). 
- The priors on the model parameters are specified as in Frühwirth-Schnatter et al. (2021) and Frühwirth-Schnatter and Malsiner-Walli (2019), see the vignette for details and notation. 
Usage
samplePoisMixture(
  y,
  S,
  mu,
  eta,
  a0,
  b0,
  h0,
  H0,
  M,
  burnin,
  thin,
  Kmax,
  G = c("MixDynamic", "MixStatic"),
  priorOnK,
  priorOnWeights,
  verbose = FALSE
)
Arguments
| y | A numeric matrix; containing the data. | 
| S | A numeric matrix; containing the initial cluster assignments. | 
| mu | A numeric matrix; containing the initial cluster-specific rate values. | 
| eta | A numeric vector; containing the initial cluster sizes. | 
| a0 | A numeric vector; hyperparameter of the prior on the rate \mu_k. | 
| b0 | A numeric vector; hyperparameter of the prior on the rate \mu_k. | 
| h0 | A numeric vector; hyperparameter of the prior on the rate \mu_k. | 
| H0 | A numeric vector; hyperparameter of the prior on the rate \mu_k. | 
| M | A numeric scalar; specifying the number of recorded iterations. | 
| burnin | A numeric scalar; specifying the number of burn-in iterations. | 
| thin | A numeric scalar; specifying the thinning used for the iterations. | 
| Kmax | A numeric scalar; the maximum number of components. | 
| G | A character string; either "MixDynamic" or "MixStatic". | 
| priorOnK | A named list; providing the prior on the number of components K, see priorOnK_spec(). | 
| priorOnWeights | A named list; providing the prior on the mixture weights. | 
| verbose | A logical; indicating if some intermediate clustering results should be printed. | 
Value
A named list containing:
-  "Mu": sampled rate\mu.
-  "Eta": sampled weights.
-  "S": sampled assignments.
-  "Nk": number of observations assigned to the different components, for each iteration.
-  "K": sampled number of components.
-  "Kplus": number of filled, i.e., non-empty components, for each iteration.
-  "e0": sampled Dirichlet parameter of the prior on the weights (ife_0is random).
-  "alpha": sampled Dirichlet parameter of the prior on the weights (if\alphais random).
-  "acc": logical vector indicating acceptance in the Metropolis-Hastings step when sampling either e0 or\alpha.
Examples
N <- 200
z <- sample(1:2, N, prob = c(0.5, 0.5), replace = TRUE)
y <- rpois(N, c(1, 6)[z])
M <- 200
thin <- 1
burnin <- 100
Kmax <- 50  
Kinit <- 10
G <- "MixDynamic"
priorOnAlpha <- priorOnAlpha_spec("gam_1_2")
priorOnK <- priorOnK_spec("BNB_143")
a0 <- 0.1 
h0 <- 0.5 
b0 <- a0/mean(y) 
H0 <- h0/b0
cl_y <- kmeans(y, centers = Kinit, nstart = 100)
S_0 <- cl_y$cluster
mu_0 <- t(cl_y$centers)
eta_0 <- rep(1/Kinit, Kinit)
result <- samplePoisMixture(
  y, S_0, mu_0, eta_0, 
  a0, b0, h0, H0,
  M, burnin, thin, Kmax, 
  G, priorOnK, priorOnAlpha)
K <- result$K
Kplus <- result$Kplus
plot(K, type = "l", ylim = c(0, max(K)), 
     xlab = "iteration", main = "",
     ylab = expression("K" ~ "/" ~ K["+"]), col = 1)
lines(Kplus, col = 2)
legend("topright", legend = c("K", expression(K["+"])),
       col = 1:2, lty = 1, box.lwd = 0)
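A possible follow-up (sketch): check the acceptance rate of the Metropolis-Hastings step for \alpha and the posterior frequencies of K_+.
mean(result$acc)                 # acceptance rate when sampling alpha
table(Kplus) / length(Kplus)     # posterior frequencies of K_+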
Telescoping sampling of a Bayesian finite univariate Gaussian mixture where a prior on the number of components K is specified.
Description
- The MCMC scheme is implemented as suggested in Frühwirth-Schnatter et al (2021). 
- The priors on the model parameters are specified as in Frühwirth-Schnatter et al (2021), see the vignette for details and notation. 
- The parametrizations of the gamma and inverse gamma distribution are used as in Frühwirth-Schnatter et al (2021), see also the vignette. 
Usage
sampleUniNormMixture(
  y,
  S,
  mu,
  sigma2,
  eta,
  c0,
  g0,
  G0,
  C0_0,
  b0,
  B0,
  M,
  burnin,
  thin,
  Kmax,
  G = c("MixDynamic", "MixStatic"),
  priorOnK,
  priorOnWeights,
  verbose = FALSE
)
Arguments
| y | A numeric matrix; containing the data. | 
| S | A numeric matrix; containing the initial cluster assignments. | 
| mu | A numeric matrix; containing the initial cluster-specific mean values. | 
| sigma2 | A numeric matrix; containing the initial cluster-specific variance values. | 
| eta | A numeric vector; containing the initial cluster sizes. | 
| c0 | A numeric vector; hyperparameter of the prior on \sigma^2_k. | 
| g0 | A numeric vector; hyperparameter of the prior on C0. | 
| G0 | A numeric vector; hyperparameter of the prior on C0. | 
| C0_0 | A numeric vector; initial value of the hyperparameter C0. | 
| b0 | A numeric vector; hyperparameter of the prior on \mu_k. | 
| B0 | A numeric vector; hyperparameter of the prior on \mu_k. | 
| M | A numeric scalar; specifying the number of recorded iterations. | 
| burnin | A numeric scalar; specifying the number of burn-in iterations. | 
| thin | A numeric scalar; specifying the thinning used for the iterations. | 
| Kmax | A numeric scalar; the maximum number of components. | 
| G | A character string; either "MixDynamic" or "MixStatic". | 
| priorOnK | A named list; providing the prior on the number of components K, see priorOnK_spec(). | 
| priorOnWeights | A named list; providing the prior on the mixture weights. | 
| verbose | A logical; indicating if some intermediate clustering results should be printed. | 
Value
A named list containing:
-  "Mu": sampled component means.
-  "Sigma2": sampled component component variances.
-  "Eta": sampled weights.
-  "S": sampled assignments.
-  "Nk": number of observations assigned to the different components, for each iteration.
-  "K": sampled number of components.
-  "Kplus": number of filled, i.e., non-empty components, for each iteration.
-  "e0": sampled Dirichlet parameter of the prior on the weights (ife_0is random).
-  "alpha": sampled Dirichlet parameter of the prior on the weights (if\alphais random).
-  "acc": logical vector indicating acceptance in the Metropolis-Hastings step when sampling eithere_0or\alpha.
Examples
if (requireNamespace("mclust", quietly = TRUE)) {
    data("acidity", package = "mclust")
    y <- acidity
    
    N <- length(y)
    r <- 1
    
    M <- 200
    thin <- 1
    burnin <- 100
    Kmax <- 50  
    Kinit <- 10
    
    G <- "MixStatic" 
    priorOnE0 <- priorOnE0_spec("e0const", 0.01)
    priorOnK <- priorOnK_spec("Pois_1", 50)
    
    R <- diff(range(y))
    c0 <- 2 + (r-1)/2
    C0 <- diag(c(0.02*(R^2)), nrow = r)
    g0 <- 0.2 + (r-1) / 2
    G0 <- diag(10/(R^2), nrow = r)
    B0 <- diag((R^2), nrow = r)
    b0 <- as.matrix((max(y) + min(y))/2, ncol = 1)  
    
    cl_y <- kmeans(y, centers = Kinit, nstart = 100)
    S_0 <- cl_y$cluster
    mu_0 <- t(cl_y$centers)
    eta_0 <- rep(1/Kinit, Kinit)
    sigma2_0 <- array(0, dim = c(1, 1, Kinit))
    sigma2_0[1, 1, ] <- 0.5 * C0
    result <- sampleUniNormMixture(
        y, S_0, mu_0, sigma2_0, eta_0,
        c0, g0, G0, C0, b0, B0,
        M, burnin, thin, Kmax,
        G, priorOnK, priorOnE0)
    
    K <- result$K
    Kplus <- result$Kplus  
    
    plot(K, type = "l", ylim = c(0, max(K)),
         xlab = "iteration", main = "",
         ylab = expression("K" ~ "/" ~ K["+"]), col = 1)
    lines(Kplus, col = 2)
    legend("topright", legend = c("K", expression(K["+"])),
           col = 1:2, lty = 1, box.lwd = 0)
}