Spectral Multidimensional Scaling — do.spmds • Rdimtools

do.spmds transfers the classical multidimensional scaling problem into the data spectral domain using Laplace-Beltrami operator. Its flexibility to use subsamples and spectral interpolation of non-reference data enables relatively efficient computation for large-scale data.

do.spmds(
  X,
  ndim = 2,
  neigs = max(2, nrow(X)/10),
  ratio = 0.1,
  preprocess = c("null", "center", "scale", "cscale", "decorrelate", "whiten"),
  type = c("proportion", 0.1),
  symmetric = c("union", "intersect", "asymmetric")
)

Arguments

X: an \((n\times p)\) matrix or data frame whose rows are observations and columns represent independent variables.
ndim: an integer-valued target dimension.
neigs: number of eigenvectors to be used as spectral dimension.
ratio: percentage of subsamples as reference points.
preprocess: an additional option for preprocessing the data. Default is "null". See also aux.preprocess for more details.
type: a vector of neighborhood graph construction. Following types are supported; c("knn",k), c("enn",radius), and c("proportion",ratio). Default is c("proportion",0.1), connecting about 1/10 of nearest data points among all data points. See also aux.graphnbd for more details.
symmetric: one of "intersect", "union" or "asymmetric" is supported. Default is "union". See also aux.graphnbd for more details.

Value

a named list containing

Y: an \((n\times ndim)\) matrix whose rows are embedded observations.
trfinfo: a list containing information for out-of-sample prediction.

References

Aflalo Y, Kimmel R (2013). “Spectral Multidimensional Scaling.” Proceedings of the National Academy of Sciences, 110(45), 18052--18057.

Author

Kisung You

Examples

if (FALSE) {
## Replicate the numerical example from the paper
#  Data Preparation
set.seed(100)
dim.true  = 3     # true dimension
dim.embed = 100   # embedding space (high-d)
npoints   = 1000  # number of samples to be generated

v     = matrix(runif(dim.embed*dim.true),ncol=dim.embed)
coeff = matrix(runif(dim.true*npoints),  ncol=dim.true)
X     = coeff%*%v

# see the effect of neighborhood size
out1  = do.spmds(X, neigs=100, type=c("proportion",0.10))
out2  = do.spmds(X, neigs=100, type=c("proportion",0.25))
out3  = do.spmds(X, neigs=100, type=c("proportion",0.50))

# visualize the results
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(out1$Y, main="10% neighborhood")
plot(out2$Y, main="25% neighborhood")
plot(out3$Y, main="50% neighborhood")
par(opar)
}