Wasserstein Distance Estimation with Bootstrapping
This function computes the \(\mathcal{W}_p\) distance between two empirical measures and uses the bootstrap to quantify the uncertainty of the estimate.
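For reference, the standard definition of the \(\mathcal{W}_p\) distance between two weighted empirical measures can be written as below. The Euclidean ground cost \(\lVert x_i - y_j\rVert\) is an assumption, since this page does not state which cost is used; \(x_i\) and \(y_j\) denote the rows of X and Y, and \(w_x\), \(w_y\) are the marginal weights described under Arguments.

```latex
\mu = \sum_{i=1}^{M} w_{x,i}\,\delta_{x_i}, \qquad
\nu = \sum_{j=1}^{N} w_{y,j}\,\delta_{y_j}, \qquad
\mathcal{W}_p(\mu,\nu) =
  \left( \min_{\Gamma \in \Pi(w_x, w_y)}
  \sum_{i=1}^{M} \sum_{j=1}^{N} \Gamma_{ij}\,\lVert x_i - y_j \rVert^{p}
  \right)^{1/p}
```

Here \(\Pi(w_x, w_y)\) is the set of nonnegative \((M\times N)\) matrices whose row sums equal \(w_x\) and whose column sums equal \(w_y\).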
Arguments
- X
an \((M\times P)\) matrix of row observations.
- Y
an \((N\times P)\) matrix of row observations.
- p
an exponent for the order of the distance (default: 2).
- B
number of bootstrap samples (default: 500).
- wx
a length-\(M\) marginal density (weight vector) that sums to \(1\). If NULL (default), uniform weights are used.
- wy
a length-\(N\) marginal density (weight vector) that sums to \(1\). If NULL (default), uniform weights are used.
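A minimal sketch of preparing non-uniform weights for wx and wy. The raw scores and the sizes M = 4 and N = 5 are hypothetical values chosen for illustration; only the requirement that each vector sums to 1 comes from this page. The final call is left as a comment because it needs the package and matching X and Y matrices.

```r
# Hypothetical raw importance scores for M = 4 rows of X (illustrative only)
raw_wx <- c(2, 1, 1, 4)
wx <- raw_wx / sum(raw_wx)   # normalize so the weights sum to 1

# Uniform weights for N = 5 rows of Y, written out explicitly;
# leaving wy = NULL is documented to give the same uniform weighting
N  <- 5
wy <- rep(1 / N, N)

# Both vectors must be valid marginal densities
stopifnot(all(wx >= 0), all(wy >= 0))
stopifnot(isTRUE(all.equal(sum(wx), 1)), isTRUE(all.equal(sum(wy), 1)))

# With a 4-row X and a 5-row Y, the call would then look like:
# out <- wassboot(X, Y, p = 2, B = 500, wx = wx, wy = wy)
```

Normalizing in code rather than typing rounded decimals avoids weights that miss 1 by floating-point error.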
Value
a named list containing
- distance
\(\mathcal{W}_p\) distance value.
- boot_samples
a length-\(B\) vector of bootstrapped distance values.
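One common way to turn the returned values into an uncertainty statement is a bootstrap standard error and a percentile interval. The sketch below uses a synthetic stand-in for the returned list so that it runs without the package; the numbers in `boots` are made up and are not output of wassboot. The 95% level is likewise an illustrative choice.

```r
set.seed(1)

# Synthetic stand-in with the same two fields as the returned list
# (hypothetical values, not real wassboot output)
boots <- list(distance     = 10,
              boot_samples = rnorm(500, mean = 10, sd = 0.2))

# Bootstrap standard error: spread of the bootstrapped distances
boot_se <- sd(boots$boot_samples)

# 95% percentile interval from the empirical quantiles
boot_ci <- quantile(boots$boot_samples, probs = c(0.025, 0.975))

print(boot_se)
print(boot_ci)
```

The percentile interval is the simplest choice; other bootstrap intervals exist and may behave better when the bootstrap distribution is skewed.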
Examples
# \donttest{
#-------------------------------------------------------------------
# Bootstrapping Wasserstein Distance between Two Bivariate Normals
#
# * class 1 : samples from Gaussian with mean=(-5, 0)
# * class 2 : samples from Gaussian with mean=(+5, 0)
#-------------------------------------------------------------------
## SMALL EXAMPLE
m = round(runif(1, min=50, max=100))
n = round(runif(1, min=50, max=100))
X = matrix(rnorm(m*2), ncol=2) # m obs. for X
Y = matrix(rnorm(n*2), ncol=2) # n obs. for Y
X[,1] = X[,1] - 5
Y[,1] = Y[,1] + 5
## COMPUTE THE BOOTSTRAP SAMPLES
boots = wassboot(X, Y, B=1000)
## VISUALIZE
opar <- par(no.readonly=TRUE)
hist(boots$boot_samples, xlab="Estimates", main="Bootstrap Samples")
abline(v=boots$distance, lwd=2, col="blue")
abline(v=mean(boots$boot_samples), lwd=2, col="red")
abline(v=10, col="cyan", lwd=2) # ground truth: with equal covariances, W_2 is the distance between the means
legend("topright", c("ground truth","estimate","bootstrap mean"),
       col=c("cyan","blue","red"), lwd=2)
par(opar)
# }