Type: Package
Title: Sample-Based Estimation of Kullback-Leibler Divergence
Version: 1.0.0
Maintainer: Niklas Hartung <niklas.hartung@gmail.com>
Description: Estimation algorithms for Kullback-Leibler divergence between two probability distributions, based on one or two samples, and including uncertainty quantification. Distributions can be uni- or multivariate and continuous, discrete or mixed.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.2.3
Imports: stats, RANN
Suggests: knitr, rmarkdown, KernSmooth, testthat (≥ 3.0.0)
Config/testthat/edition: 3
Config/Needs/website: ggplot2, reshape2, MASS
URL: https://niklhart.github.io/kldest/
BugReports: https://github.com/niklhart/kldest/issues
NeedsCompilation: no
Packaged: 2024-04-08 13:15:50 UTC; niklashartung
Author: Niklas Hartung
Repository: CRAN
Date/Publication: 2024-04-09 08:20:02 UTC
kldest: Sample-Based Estimation of Kullback-Leibler Divergence
Description
Estimation algorithms for Kullback-Leibler divergence between two probability distributions, based on one or two samples, and including uncertainty quantification. Distributions can be uni- or multivariate and continuous, discrete or mixed.
Author(s)
Maintainer: Niklas Hartung niklas.hartung@gmail.com [copyright holder]
See Also
Useful links:
- https://niklhart.github.io/kldest/
- Report bugs at https://github.com/niklhart/kldest/issues
Combinations of input arguments
Description
Combinations of input arguments
Usage
combinations(...)
Arguments
...: Any number of atomic vectors.
Value
A data frame with columns named as the inputs, containing all input combinations.
Examples
combinations(a = 1:2, b = letters[1:3], c = LETTERS[1:2])
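For the call above, one row is produced per combination of the inputs; a quick size check using base R only:
# 2 x 3 x 2 = 12 combinations expected
nrow(combinations(a = 1:2, b = letters[1:3], c = LETTERS[1:2]))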
Constant plus diagonal matrix
Description
Specify a matrix with constant values on the diagonal and on the off-diagonals. Such matrices can be used to vary the degree of dependency in covariate matrices, for example when evaluating accuracy of KL-divergence estimation algorithms.
Usage
constDiagMatrix(dim = 1, diag = 1, offDiag = 0)
Arguments
dim: Dimension
diag: Value at the diagonal
offDiag: Value at off-diagonals
Value
A dim-by-dim matrix
Examples
constDiagMatrix(dim = 3, diag = 1, offDiag = 0.9)
Empirical convergence rate of a KL divergence estimator
Description
Subsampling-based confidence intervals computed by kld_ci_subsampling()
require the convergence rate of the KL divergence estimator as an input. The
default rate of 0.5 assumes that the variance term dominates the bias term.
For high-dimensional problems, depending on the data, the convergence rate
might be lower. This function allows the convergence rate to be derived
empirically.
Usage
convergence_rate(
  estimator,
  X,
  Y = NULL,
  q = NULL,
  n.sizes = 4,
  spacing.factor = 1.5,
  typical.subsample = function(n) sqrt(n),
  B = 500L,
  plot = FALSE
)
Arguments
estimator: A KL divergence estimator.
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames).
q: The density function of the approximate distribution Q, used instead of Y in the one-sample problem.
n.sizes: Number of different subsample sizes to use (default: 4).
spacing.factor: Multiplicative factor controlling the spacing of sample sizes (default: 1.5).
typical.subsample: A function that produces a typical subsample size, used as the geometric mean of subsample sizes (default: function(n) sqrt(n)).
B: Number of subsamples to draw per subsample size (default: 500).
plot: A boolean (default: FALSE): should the results be plotted?
Details
References:
Politis, Romano and Wolf, "Subsampling", Chapter 8 (1999), for theory.
The implementation has been adapted from lecture notes by C. J. Geyer, https://www.stat.umn.edu/geyer/5601/notes/sub.pdf
Value
A scalar, the parameter \beta in the empirical convergence
rate n^{-\beta} of the estimator to the true KL divergence.
It can be used in the convergence.rate argument of kld_ci_subsampling()
as convergence.rate = function(n) n^beta.
Examples
    # NN method usually has a convergence rate around 0.5:
    set.seed(0)
    convergence_rate(kld_est_nn, X = rnorm(1000), Y = rnorm(1000, mean = 1, sd = 2))
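    The estimated rate can be fed into subsampling-based confidence intervals, as described under Value; a minimal sketch continuing the example above:
    set.seed(0)
    X <- rnorm(1000)
    Y <- rnorm(1000, mean = 1, sd = 2)
    beta <- convergence_rate(kld_est_nn, X = X, Y = Y)
    # use the empirical rate instead of the default sqrt
    kld_ci_subsampling(X, Y, convergence.rate = function(n) n^beta)$ci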
Detect if a one- or two-sample problem is specified
Description
Detect if a one- or two-sample problem is specified
Usage
is_two_sample(Y, q)
Arguments
Y: A vector, matrix, data frame or NULL.
q: A function or NULL.
Value
TRUE for a two-sample problem (i.e., Y non-null and q = NULL)
and FALSE for a one-sample problem (i.e., Y = NULL and q non-null).
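A minimal usage sketch (this small helper may be intended for internal use within the package):
is_two_sample(Y = rnorm(10), q = NULL)             # TRUE:  two-sample problem
is_two_sample(Y = NULL, q = function(x) dnorm(x))  # FALSE: one-sample problem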
Uncertainty of KL divergence estimate using Efron's bootstrap.
Description
This function computes a confidence interval for KL divergence based on Efron's bootstrap. The approach only works for kernel density-based estimators since nearest neighbour-based estimators cannot deal with the ties produced when sampling with replacement.
Usage
kld_ci_bootstrap(
  X,
  Y,
  estimator = kld_est_kde1,
  B = 500L,
  alpha = 0.05,
  method = c("quantile", "se"),
  include.boot = FALSE,
  ...
)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (numeric vectors or matrices).
estimator: A function expecting two inputs X and Y, the kernel density-based KL divergence estimator to use (default: kld_est_kde1).
B: Number of bootstrap replicates (default: 500).
alpha: Error level, defaults to 0.05.
method: Either "quantile" (the default) or "se", specifying how the confidence interval is computed from the bootstrap replicates.
include.boot: Boolean: should the bootstrap KL divergence estimates be included in the output (default: FALSE)?
...: Arguments passed on to estimator.
Details
Reference:
Efron, "Bootstrap Methods: Another Look at the Jackknife", The Annals of Statistics, Vol. 7, No. 1 (1979).
Value
A list with the following fields:
- "est": the estimated KL divergence,
- "boot": a length-B numeric vector with KL divergence estimates on the bootstrap subsamples (only included if include.boot = TRUE),
- "ci": a length-2 vector containing the lower and upper limits of the estimated confidence interval.
Examples
# 1D Gaussian, two samples
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_est_kde1(X, Y)
kld_ci_bootstrap(X, Y)
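The returned list can be inspected field by field; for instance, continuing the example above (with B reduced to keep the run short):
res <- kld_ci_bootstrap(X, Y, B = 200L, include.boot = TRUE)
res$est           # point estimate
res$ci            # confidence interval
length(res$boot)  # 200 bootstrap estimates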
Uncertainty of KL divergence estimate using Politis/Romano's subsampling bootstrap.
Description
This function computes a confidence interval for KL divergence based on the subsampling bootstrap introduced by Politis and Romano. See Details for theoretical properties of this method.
Usage
kld_ci_subsampling(
  X,
  Y = NULL,
  q = NULL,
  estimator = kld_est_nn,
  B = 500L,
  alpha = 0.05,
  subsample.size = function(x) x^(2/3),
  convergence.rate = sqrt,
  method = c("quantile", "se"),
  include.boot = FALSE,
  n.cores = 1L,
  ...
)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames).
q: The density function of the approximate distribution Q, used instead of Y in the one-sample problem.
estimator: The Kullback-Leibler divergence estimation method; a function expecting two inputs (X and Y, or X and q). Defaults to kld_est_nn.
B: Number of bootstrap replicates (default: 500).
alpha: Error level, defaults to 0.05.
subsample.size: A function specifying the size of the subsamples, defaults to function(x) x^(2/3).
convergence.rate: A function computing the convergence rate of the estimator as a function of sample sizes. Defaults to sqrt, i.e. a rate of n^{1/2}.
method: Either "quantile" (the default) or "se", specifying how the confidence interval is computed from the subsampling replicates.
include.boot: Boolean: should the subsampling KL divergence estimates be included in the output (default: FALSE)?
n.cores: Number of cores to use in parallel computing (defaults to 1, i.e. no parallel computing).
...: Arguments passed on to estimator.
Details
In general terms, letting b_n be the subsample size for a sample of
size n, and \tau_n the convergence rate of the estimator, a
confidence interval calculated by subsampling has asymptotic coverage
1 - \alpha as long as b_n/n\rightarrow 0,
b_n\rightarrow\infty and \frac{\tau_{b_n}}{\tau_n}\rightarrow 0.
In many cases, the convergence rate of the nearest-neighbour based KL
divergence estimator is \tau_n = \sqrt{n} and the condition on the
subsample size reduces to b_n/n\rightarrow 0 and b_n\rightarrow\infty.
By default, b_n = n^{2/3}. In a two-sample problem, n and b_n
are replaced by effective sample sizes n_\text{eff} = \min(n,m) and
b_{n,\text{eff}} = \min(b_n,b_m).
Reference:
Politis and Romano, "Large sample confidence regions based on subsamples under minimal assumptions", The Annals of Statistics, Vol. 22, No. 4 (1994).
Value
A list with the following fields:
- "est": the estimated KL divergence,
- "ci": a length-2 vector containing the lower and upper limits of the estimated confidence interval,
- "boot": a length-B numeric vector with KL divergence estimates on the subsamples (only included if include.boot = TRUE).
Examples
# 1D Gaussian (one- and two-sample problems)
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
q <- function(x) dnorm(x, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_est_nn(X, Y = Y)
kld_est_nn(X, q = q)
kld_ci_subsampling(X, Y)$ci
kld_ci_subsampling(X, q = q)$ci
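Non-default subsample sizes and convergence rates can be supplied as functions; a sketch with illustrative parameter choices:
kld_ci_subsampling(X, Y,
                   subsample.size   = function(n) n^0.5,
                   convergence.rate = function(n) n^0.4)$ci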
Analytical KL divergence for two discrete distributions
Description
Analytical KL divergence for two discrete distributions
Usage
kld_discrete(P, Q)
Arguments
P, Q: Numerical arrays with the same dimensions, representing discrete probability distributions.
Value
A scalar (the Kullback-Leibler divergence)
Examples
# 1-D example
P <- 1:4/10
Q <- rep(0.25,4)
kld_discrete(P,Q)
# The above example in 2-D
P <- matrix(1:4/10,nrow=2)
Q <- matrix(0.25,nrow=2,ncol=2)
kld_discrete(P,Q)
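The same value can be computed directly from the definition of discrete KL divergence (a quick base-R check):
sum(P * log(P/Q))   # matches kld_discrete(P, Q)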
Kullback-Leibler divergence estimator for discrete, continuous or mixed data.
Description
For two mixed continuous/discrete distributions with densities p and
q, and denoting x = (x_c, x_d), the Kullback-Leibler
divergence D_{KL}(p||q) is given as
D_{KL}(p||q) = \sum_{x_d} \int p(x_c,x_d) \log\left(\frac{p(x_c,x_d)}{q(x_c,x_d)}\right)dx_c.
Conditioning on the discrete variables x_d, this can be re-written as
D_{KL}(p||q) = \sum_{x_d} p(x_d) D_{KL}\big(p(\cdot|x_d)||q(\cdot|x_d)\big) + D_{KL}\big(p_{x_d}||q_{x_d}\big).
Here, the terms
D_{KL}\big(p(\cdot|x_d)||q(\cdot|x_d)\big)
are approximated via nearest neighbour- or kernel-based density estimates on
the datasets X and Y stratified by the discrete variables, and
D_{KL}\big(p_{x_d}||q_{x_d}\big)
is approximated using relative frequencies.
Usage
kld_est(
  X,
  Y = NULL,
  q = NULL,
  estimator.continuous = kld_est_nn,
  estimator.discrete = kld_est_discrete,
  vartype = NULL
)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames with continuous and/or discrete columns).
q: The density function of the approximate distribution Q, used instead of Y in the one-sample problem; for mixed data, a list with components cond and disc (see the one-sample example below).
estimator.continuous, estimator.discrete: KL divergence estimators for continuous and discrete data, respectively. Both are functions with two arguments X and Y, defaulting to kld_est_nn and kld_est_discrete.
vartype: A length-d character vector with entries "c" (continuous) or "d" (discrete), specifying the type of each variable (see the one-sample example below).
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# 2D example, two samples
set.seed(0)
X <- data.frame(cont  = rnorm(10),
                discr = c(rep('a',4),rep('b',6)))
Y <- data.frame(cont  = c(rnorm(5), rnorm(5, sd = 2)),
                discr = c(rep('a',5),rep('b',5)))
kld_est(X, Y)
# 2D example, one sample
set.seed(0)
X <- data.frame(cont  = rnorm(10),
                discr = c(rep(0,4),rep(1,6)))
q <- list(cond = function(xc,xd) dnorm(xc, mean = xd, sd = 1),
          disc = function(xd) dbinom(xd, size = 1, prob = 0.5))
kld_est(X, q = q, vartype = c("c","d"))
Bias-reduced generalized k-nearest-neighbour KL divergence estimation
Description
This is the bias-reduced generalized k-NN based KL divergence estimator from Wang et al. (2009) specified in Eq.(29).
Usage
kld_est_brnn(X, Y, max.k = 100, warn.max.k = TRUE, eps = 0)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames).
max.k: Maximum number of nearest neighbours to compute (default: 100).
warn.max.k: If TRUE (the default), a warning is given if max.k is reached.
eps: Error bound in the nearest neighbour search. A value of eps = 0 (the default) implies an exact nearest neighbour search.
Details
Finite sample bias reduction is achieved by an adaptive choice of the number
of nearest neighbours. Fixing the number of nearest neighbours upfront, as
done in kld_est_nn(), may result in very different distances
\rho^l_i, \nu^k_i of a datapoint x_i to its l-th nearest
neighbour in X and k-th nearest neighbour in Y,
respectively, which may lead to unequal biases in NN density estimation,
especially in a high-dimensional setting.
To overcome this issue, the numbers of neighbours l, k are chosen here
so as to render \rho^l_i, \nu^k_i comparable, by taking the largest
possible numbers of neighbours l_i, k_i whose corresponding distances
remain smaller than \delta_i := \max(\rho^1_i, \nu^1_i).
Since the bias reduction explicitly uses both samples X and Y, one-sample
estimation is not possible using this method.
Reference: Wang, Kulkarni and Verdú, "Divergence Estimation for Multidimensional Densities Via k-Nearest-Neighbor Distances", IEEE Transactions on Information Theory, Vol. 55, No. 5 (2009). DOI: https://doi.org/10.1109/TIT.2009.2016060
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# KL-D between one or two samples from 1-D Gaussians:
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
q <- function(x) dnorm(x, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_est_nn(X, Y)
kld_est_nn(X, q = q)
kld_est_nn(X, Y, k = 5)
kld_est_nn(X, q = q, k = 5)
kld_est_brnn(X, Y)
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
X1 <- rnorm(100)
X2 <- rnorm(100)
Y1 <- rnorm(100)
Y2 <- Y1 + rnorm(100)
X <- cbind(X1,X2)
Y <- cbind(Y1,Y2)
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
             mu2 = rep(0,2), sigma2 = matrix(c(1,1,1,2),nrow=2))
kld_est_nn(X, Y)
kld_est_nn(X, Y, k = 5)
kld_est_brnn(X, Y)
Plug-in KL divergence estimator for samples from discrete distributions
Description
Plug-in KL divergence estimator for samples from discrete distributions
Usage
kld_est_discrete(X, Y = NULL, q = NULL)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames with discrete entries).
q: The probability mass function of the approximate distribution Q, used instead of Y in the one-sample problem.
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# 1D example, two samples
X <- c(rep('M',5),rep('F',5))
Y <- c(rep('M',6),rep('F',4))
kld_est_discrete(X, Y)
# 1D example, one sample
X <- c(rep(0,4),rep(1,6))
q <- function(x) dbinom(x, size = 1, prob = 0.5)
kld_est_discrete(X, q = q)
Kernel density-based Kullback-Leibler divergence estimation in any dimension
Description
Disclaimer: this function does not use binning and/or the fast Fourier transform, and hence it is extremely slow even for moderately sized datasets. For this reason, it is currently not exported.
Usage
kld_est_kde(X, Y, hX = NULL, hY = NULL, rule = c("Silverman", "Scott"))
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (matrices or data frames with d columns).
hX, hY: Positive scalars or length-d vectors, the bandwidths used in the kernel density estimates of X and Y, respectively; if NULL (the default), bandwidths are chosen according to rule.
rule: A heuristic for computing the bandwidths hX and hY if they are not specified: either "Silverman" (the default, Silverman's rule of thumb) or, as an alternative, "Scott" (Scott's rule).
Details
This estimation method approximates the densities of the unknown distributions
P and Q by kernel density estimates, using a sample size- and
dimension-dependent bandwidth parameter and a Gaussian kernel. It works for
any number of dimensions but is very slow.
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# KL-D between two samples from 1-D Gaussians:
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_est_kde1(X, Y)
kld_est_nn(X, Y)
kld_est_brnn(X, Y)
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
X1 <- rnorm(100)
X2 <- rnorm(100)
Y1 <- rnorm(100)
Y2 <- Y1 + rnorm(100)
X <- cbind(X1,X2)
Y <- cbind(Y1,Y2)
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
             mu2 = rep(0,2), sigma2 = matrix(c(1,1,1,2),nrow=2))
kld_est_kde2(X, Y)
kld_est_nn(X, Y)
kld_est_brnn(X, Y)
1-D kernel density-based estimation of Kullback-Leibler divergence
Description
This estimation method approximates the densities of the unknown distributions
P and Q by a kernel density estimate using function 'density' from
package 'stats'. Only the two-sample problem, not the one-sample problem, is implemented.
Usage
kld_est_kde1(X, Y, MC = FALSE, ...)
Arguments
X, Y: Numeric vectors or single-column matrices, representing samples from the true distribution P and the approximate distribution Q, respectively.
MC: A boolean: use a Monte Carlo approximation instead of numerical integration via the trapezoidal rule (default: FALSE).
...: Further parameters passed on to density() from package 'stats'.
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# KL-D between two samples from 1D Gaussians:
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_est_kde1(X,Y)
kld_est_kde1(X,Y, MC = TRUE)
2-D kernel density-based estimation of Kullback-Leibler divergence
Description
This estimation method approximates the densities of the unknown bivariate
distributions P and Q by kernel density estimates using function
'bkde' from package 'KernSmooth'. If 'KernSmooth' is not installed, a message
is issued and the (much) slower function 'kld_est_kde' is used instead.
Usage
kld_est_kde2(
  X,
  Y,
  MC = FALSE,
  hX = NULL,
  hY = NULL,
  rule = c("Silverman", "Scott"),
  eps = 1e-05
)
Arguments
X, Y: Samples from the true (bivariate) distribution P and the approximate distribution Q, respectively (matrices or data frames with two columns).
MC: A boolean: use a Monte Carlo approximation instead of numerical integration via the trapezoidal rule (default: FALSE).
hX, hY: Bandwidths for the kernel density estimates of X and Y, respectively; if NULL (the default), bandwidths are chosen according to rule.
rule: A heuristic to derive the bandwidths hX and hY if they are not specified: either "Silverman" (the default) or "Scott".
eps: A nonnegative scalar (default: 1e-05); if eps > 0, it is used as a lower cutoff for the estimated densities to avoid numerical problems in the log-ratio.
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
X1 <- rnorm(1000)
X2 <- rnorm(1000)
Y1 <- rnorm(1000)
Y2 <- Y1 + rnorm(1000)
X <- cbind(X1,X2)
Y <- cbind(Y1,Y2)
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
             mu2 = rep(0,2), sigma2 = matrix(c(1,1,1,2),nrow=2))
kld_est_kde2(X,Y)
k-nearest neighbour KL divergence estimator
Description
This function estimates Kullback-Leibler divergence D_{KL}(P||Q) between
two continuous distributions P and Q using nearest-neighbour (NN)
density estimation in a Monte Carlo approximation of D_{KL}(P||Q).
Usage
kld_est_nn(X, Y = NULL, q = NULL, k = 1L, eps = 0, log.q = FALSE)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames).
q: The density function of the approximate distribution Q, used instead of Y in the one-sample problem.
k: The number of nearest neighbours to consider for NN density estimation (default: k = 1). Larger values of k reduce the variance of the estimator at the cost of a potentially larger bias.
eps: Error bound in the nearest neighbour search. A value of eps = 0 (the default) implies an exact nearest neighbour search.
log.q: If TRUE, the function q is assumed to return log-density values rather than density values (default: FALSE).
Details
Input for estimation is a sample X from P and either the density
function q of Q (one-sample problem) or a sample Y of Q
(two-sample problem). In the two-sample problem, it is the estimator in Eq.(5)
of Wang et al. (2009). In the one-sample problem, the asymptotic bias (the
expectation of a Gamma distribution) is subtracted, see Pérez-Cruz (2008),
Eq.(18).
References:
Wang, Kulkarni and Verdú, "Divergence Estimation for Multidimensional Densities Via k-Nearest-Neighbor Distances", IEEE Transactions on Information Theory, Vol. 55, No. 5 (2009).
Pérez-Cruz, "Kullback-Leibler Divergence Estimation of Continuous Distributions", IEEE International Symposium on Information Theory (2008).
Value
A scalar, the estimated Kullback-Leibler divergence \hat D_{KL}(P||Q).
Examples
# KL-D between one or two samples from 1-D Gaussians:
set.seed(0)
X <- rnorm(100)
Y <- rnorm(100, mean = 1, sd = 2)
q <- function(x) dnorm(x, mean = 1, sd = 2)
kld_gaussian(mu1 = 0, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_est_nn(X, Y)
kld_est_nn(X, q = q)
kld_est_nn(X, Y, k = 5)
kld_est_nn(X, q = q, k = 5)
kld_est_brnn(X, Y)
# KL-D between two samples from 2-D Gaussians:
set.seed(0)
X1 <- rnorm(100)
X2 <- rnorm(100)
Y1 <- rnorm(100)
Y2 <- Y1 + rnorm(100)
X <- cbind(X1,X2)
Y <- cbind(Y1,Y2)
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
             mu2 = rep(0,2), sigma2 = matrix(c(1,1,1,2),nrow=2))
kld_est_nn(X, Y)
kld_est_nn(X, Y, k = 5)
kld_est_brnn(X, Y)
Analytical KL divergence for two univariate exponential distributions
Description
This function computes D_{KL}(p||q), where p\sim \text{Exp}(\lambda_1)
and q\sim \text{Exp}(\lambda_2), in rate parametrization.
Usage
kld_exponential(lambda1, lambda2)
Arguments
lambda1: A scalar (rate parameter of the true exponential distribution)
lambda2: A scalar (rate parameter of the approximate exponential distribution)
Value
A scalar (the Kullback-Leibler divergence)
Examples
kld_exponential(lambda1 = 1, lambda2 = 2)
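For reference, the closed-form value is log(lambda1/lambda2) + lambda2/lambda1 - 1, which can be checked directly:
log(1/2) + 2/1 - 1   # same value as above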
Analytical KL divergence for two uni- or multivariate Gaussian distributions
Description
This function computes D_{KL}(p||q), where p\sim \mathcal{N}(\mu_1,\Sigma_1)
and q\sim \mathcal{N}(\mu_2,\Sigma_2).
Usage
kld_gaussian(mu1, sigma1, mu2, sigma2)
Arguments
mu1: A numeric vector (mean of the true Gaussian)
sigma1: A s.p.d. matrix (covariance matrix of the true Gaussian)
mu2: A numeric vector (mean of the approximate Gaussian)
sigma2: A s.p.d. matrix (covariance matrix of the approximate Gaussian)
Value
A scalar (the Kullback-Leibler divergence)
Examples
kld_gaussian(mu1 = 1, sigma1 = 1, mu2 = 1, sigma2 = 2^2)
kld_gaussian(mu1 = rep(0,2), sigma1 = diag(2),
                mu2 = rep(1,2), sigma2 = matrix(c(1,0.5,0.5,1), nrow = 2))
Analytical KL divergence for two uniform distributions
Description
This function computes D_{KL}(p||q), where p\sim \text{U}(a_1,b_1)
and q\sim \text{U}(a_2,b_2), with a_2 \le a_1 < b_1 \le b_2.
Usage
kld_uniform(a1, b1, a2, b2)
Arguments
a1, b1: Range of the true uniform distribution
a2, b2: Range of the approximate uniform distribution
Value
A scalar (the Kullback-Leibler divergence)
Examples
kld_uniform(a1 = 0, b1 = 1, a2 = 0, b2 = 2)
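The closed-form value here is log((b2 - a2)/(b1 - a1)):
log((2 - 0)/(1 - 0))   # log(2), same value as above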
Analytical KL divergence between a uniform and a Gaussian distribution
Description
This function computes D_{KL}(p||q), where p\sim \text{U}(a,b)
and q\sim \mathcal{N}(\mu,\sigma^2).
Usage
kld_uniform_gaussian(a = 0, b = 1, mu = 0, sigma2 = 1)
Arguments
a, b: Parameters of the uniform (true) distribution
mu, sigma2: Parameters of the Gaussian (approximate) distribution
Value
A scalar (the Kullback-Leibler divergence)
Examples
kld_uniform_gaussian(a = 0, b = 1, mu = 0, sigma2 = 1)
Probability density function of multivariate Gaussian distribution
Description
Probability density function of multivariate Gaussian distribution
Usage
mvdnorm(x, mu, Sigma)
Arguments
x: A numeric vector of length d, the evaluation point.
mu: A numeric vector of length d, the mean of the Gaussian.
Sigma: A d-by-d symmetric positive definite matrix, the covariance of the Gaussian.
Value
The probability density of N(\mu,\Sigma) evaluated at x.
Examples
# 1D example
mvdnorm(x = 2, mu = 1, Sigma = 2)
dnorm(x = 2, mean = 1, sd = sqrt(2))
# Independent 2D example
mvdnorm(x = c(2,2), mu = c(1,1), Sigma = diag(1:2))
prod(dnorm(x = c(2,2), mean = c(1,1), sd = sqrt(1:2)))
# Correlated 2D example
mvdnorm(x = c(2,2), mu = c(1,1), Sigma = matrix(c(2,1,1,2),nrow=2))
Transform samples to uniform scale
Description
Since Kullback-Leibler divergence is scale-invariant, its sample-based
approximations can be computed on a conveniently chosen scale. This helper
function transforms each variable so that all marginal distributions
of the joint dataset (X, Y) are uniform. In this way, the scales of
different variables are rendered comparable, with the idea that
neighbour-based methods perform better in this situation.
Usage
to_uniform_scale(X, Y)
Arguments
X, Y: Samples from the true distribution P and the approximate distribution Q, respectively (vectors, matrices or data frames).
Value
A list with fields X and Y, containing the transformed samples.
Examples
# 2D example
n <- 10L
X <- cbind(rnorm(n, mean = 0, sd = 3),
           rnorm(n, mean = 1, sd = 2))
Y <- cbind(rnorm(n, mean = 1, sd = 2),
           rnorm(n, mean = 0, sd = 2))
to_uniform_scale(X, Y)
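The transformed samples can then be passed to a neighbour-based estimator; a minimal sketch:
tu <- to_uniform_scale(X, Y)
kld_est_nn(tu$X, tu$Y)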
Matrix trace operator
Description
Matrix trace operator
Usage
tr(M)
Arguments
M: A square matrix
Value
The matrix trace (a scalar)
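A minimal example (this small helper may be internal to the package):
tr(diag(3))                # 3
tr(matrix(1:4, nrow = 2))  # 1 + 4 = 5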
Trapezoidal integration in 1 or 2 dimensions
Description
Trapezoidal integration in 1 or 2 dimensions
Usage
trapz(h, fx)
Arguments
h: A length-1 (1-D integration) or length-2 (2-D integration) numeric vector of grid spacings.
fx: A numeric vector (1-D) or matrix (2-D) of function values evaluated on an equidistant grid with spacing h.
Value
The trapezoidal approximation of the integral.
Examples
# 1D example
trapz(h = 1, fx = 1:10)
# 2D example
trapz(h = c(1,1), fx = matrix(1:10, nrow = 2))