do.spmds transfers the classical multidimensional scaling (MDS) problem into the spectral domain of the data using the Laplace-Beltrami operator. Because it can work from a subsample of reference points and spectrally interpolate the non-reference data, it is relatively efficient for large-scale data.

do.spmds(
  X,
  ndim = 2,
  neigs = max(2, nrow(X)/10),
  ratio = 0.1,
  preprocess = c("null", "center", "scale", "cscale", "decorrelate", "whiten"),
  type = c("proportion", 0.1),
  symmetric = c("union", "intersect", "asymmetric")
)

Arguments

X

an \((n\times p)\) matrix or data frame whose rows are observations and columns represent independent variables.

ndim

an integer-valued target dimension.

neigs

number of eigenvectors used as the spectral dimension.
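As a quick illustration, the default expression for neigs shown in the usage above, max(2, nrow(X)/10), can be evaluated for a few sample sizes. The helper default_neigs is a hypothetical name introduced only for this sketch; how do.spmds treats a non-integer value internally is not stated on this page, so passing an explicit integer may be the safer choice.

```r
# Evaluate the default `neigs` expression from the usage signature
# for a few hypothetical sample sizes (base R only).
# `default_neigs` is an illustrative helper, not part of the package.
default_neigs <- function(n) max(2, n / 10)

default_neigs(1000)  # 1000 / 10 = 100
default_neigs(15)    # 15 / 10 = 1.5 is below the floor of 2, so 2
default_neigs(25)    # 25 / 10 = 2.5, a non-integer value
```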

ratio

proportion of observations subsampled as reference points. The default 0.1 corresponds to 10% of the data.

preprocess

an additional option for preprocessing the data. Default is "null". See also aux.preprocess for more details.

type

a vector specifying how the neighborhood graph is constructed. The following types are supported: c("knn",k), c("enn",radius), and c("proportion",ratio). Default is c("proportion",0.1), which connects each point to roughly the nearest 1/10 of all data points. See also aux.graphnbd for more details.
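The three forms listed above can be written out as follows. This is a minimal base R sketch; the values 10 and 1.5 are arbitrary illustrative choices, not recommendations. One thing worth knowing is that c() coerces a mixed string and number to a character vector, so the numeric part is stored as text.

```r
# Three ways to specify the neighborhood graph, following the documented forms.
type_knn  <- c("knn", 10)          # k nearest neighbors (k = 10 is illustrative)
type_enn  <- c("enn", 1.5)         # fixed-radius neighborhood (radius is illustrative)
type_prop <- c("proportion", 0.1)  # the documented default

# c() coerces mixed inputs to character, so the number is stored as a string.
is.character(type_prop)

# The numeric value can be recovered from the second element.
as.numeric(type_prop[2])
```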

symmetric

one of "intersect", "union", or "asymmetric". Default is "union". See also aux.graphnbd for more details.

Value

a named list containing

Y

an \((n\times ndim)\) matrix whose rows are embedded observations.

trfinfo

a list containing information for out-of-sample prediction.

References

Aflalo Y, Kimmel R (2013). “Spectral Multidimensional Scaling.” Proceedings of the National Academy of Sciences, 110(45), 18052–18057.

Author

Kisung You

Examples

if (FALSE) {
## Replicate the numerical example from the paper
#  Data Preparation
set.seed(100)
dim.true  = 3     # true dimension
dim.embed = 100   # embedding space (high-d)
npoints   = 1000  # number of samples to be generated

v     = matrix(runif(dim.embed*dim.true),ncol=dim.embed)
coeff = matrix(runif(dim.true*npoints),  ncol=dim.true)
X     = coeff%*%v

# see the effect of neighborhood size
out1  = do.spmds(X, neigs=100, type=c("proportion",0.10))
out2  = do.spmds(X, neigs=100, type=c("proportion",0.25))
out3  = do.spmds(X, neigs=100, type=c("proportion",0.50))

# visualize the results
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(out1$Y, main="10% neighborhood")
plot(out2$Y, main="25% neighborhood")
plot(out3$Y, main="50% neighborhood")
par(opar)
}
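The data preparation step in the example can be checked on its own, without calling do.spmds. The sketch below rebuilds X with the same seed and settings and inspects its shape and numerical rank using base R only. The rank check via qr() is an addition for illustration; it is not part of the original example.

```r
# Rebuild the example data: a 3-dimensional linear structure
# embedded in a 100-dimensional space.
set.seed(100)
dim.true  <- 3     # true dimension
dim.embed <- 100   # embedding space (high-d)
npoints   <- 1000  # number of samples

v     <- matrix(runif(dim.embed * dim.true), ncol = dim.embed)  # 3 x 100 basis
coeff <- matrix(runif(dim.true * npoints),   ncol = dim.true)   # 1000 x 3 coefficients
X     <- coeff %*% v                                            # 1000 x 100 data matrix

dim(X)      # rows are observations, columns are variables
qr(X)$rank  # numerical rank; should match dim.true for this construction
```

Because every row of X is a linear combination of the three rows of v, the data lie on a 3-dimensional subspace, which is the structure the embedding is meant to recover.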