Sparse PCA (do.spca) is a variant of PCA in which each loading, or principal
component, is required to be sparse. Instead of relying on a generic optimization package,
we formulate the problem as a semidefinite relaxation and solve it with ADMM.
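For reference, the semidefinite relaxation of d'Aspremont et al. (2007) can be written as follows for a single component. This is a sketch of the cited formulation, not a statement of what do.spca computes internally: the symbols \(\Sigma\) (the \((p\times p)\) sample covariance matrix) and \(Z\) (the relaxed variable) are notation introduced here, and the scaling of the penalty \(\rho \ge 0\) inside the package may differ.

\[
\max_{Z \in \mathbb{S}^p} \; \mathrm{Tr}(\Sigma Z) - \rho \sum_{i,j} |Z_{ij}|
\quad \text{subject to} \quad \mathrm{Tr}(Z) = 1, \; Z \succeq 0 .
\]

Dropping the rank-one constraint \(Z = zz^\top\) is what makes the problem convex. A sparse loading is then read off from the leading eigenvector of the solution, and a larger \(\rho\) pushes more entries of \(Z\) to zero.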
do.spca(X, ndim = 2, mu = 1, rho = 1, ...)
X: an \((n\times p)\) matrix whose rows are observations and columns represent independent variables.
ndim: an integer-valued target dimension.
mu: an augmented Lagrangian parameter.
rho: a regularization parameter for sparsity.
...: extra parameters including
  maxiter: maximum number of iterations (default: 100).
  abstol: absolute tolerance stopping criterion (default: 1e-8).
  reltol: relative tolerance stopping criterion (default: 1e-4).
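To give a feel for how an augmented Lagrangian parameter and a sparsity parameter interact in an ADMM scheme, here is a minimal base-R sketch for one sparse component. It splits the relaxed problem, maximize Tr(S Z) - rho * sum|W| subject to Z = W, Tr(Z) = 1 and Z positive semidefinite, and alternates a projection step, a soft-thresholding step and a dual update. The function names (`admm_dspca`, `proj_spectraplex`, `proj_simplex`, `soft_threshold`) and the penalty parameter `beta` are hypothetical and chosen for illustration. This is not the internal code of do.spca, and how `beta` relates to `mu` is left open as an assumption.

```r
## Elementwise soft-thresholding: proximal operator of tau * ||.||_1
soft_threshold <- function(A, tau) sign(A) * pmax(abs(A) - tau, 0)

## Euclidean projection of a vector onto the probability simplex
proj_simplex <- function(v) {
  u   <- sort(v, decreasing = TRUE)
  css <- cumsum(u)
  k   <- max(which(u - (css - 1) / seq_along(u) > 0))
  pmax(v - (css[k] - 1) / k, 0)
}

## Projection onto {Z : Z is PSD, trace(Z) = 1}: project the eigenvalues
## onto the simplex and rebuild the matrix with the same eigenvectors
proj_spectraplex <- function(A) {
  eg <- eigen((A + t(A)) / 2, symmetric = TRUE)
  eg$vectors %*% (proj_simplex(eg$values) * t(eg$vectors))
}

## Scaled-form ADMM (illustrative sketch, fixed number of iterations)
##   S    : (p x p) covariance matrix
##   rho  : sparsity penalty
##   beta : augmented Lagrangian penalty (hypothetical name)
admm_dspca <- function(S, rho, beta = 1, maxiter = 500) {
  p <- nrow(S)
  W <- diag(p) / p          # sparse copy of the variable
  U <- matrix(0, p, p)      # scaled dual variable
  for (it in seq_len(maxiter)) {
    Z <- proj_spectraplex(W - U + S / beta)   # constrained update
    W <- soft_threshold(Z + U, rho / beta)    # sparsity update
    U <- U + Z - W                            # dual update
  }
  list(Z = Z, W = W)
}

## Usage on the covariance of the iris measurements (base R dataset)
S <- cov(as.matrix(datasets::iris[, 1:4]))
fit_dense  <- admm_dspca(S, rho = 0)
fit_sparse <- admm_dspca(S, rho = 10)
cat("zero entries (rho = 0): ", sum(fit_dense$W == 0), "\n")
cat("zero entries (rho = 10):", sum(fit_sparse$W == 0), "\n")
```

Increasing `rho` raises the threshold `rho / beta`, so more entries of `W` are set exactly to zero. With `rho = 0` the dual variable stays at zero and the iteration reduces to a projected gradient method for the leading principal component.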
a named Rdimtools S3 object containing
Y: an \((n\times ndim)\) matrix whose rows are embedded observations.
projection: a \((p\times ndim)\) matrix whose columns are the basis for projection.
algorithm: name of the algorithm.
Zou H, Hastie T, Tibshirani R (2006). “Sparse Principal Component Analysis.” Journal of Computational and Graphical Statistics, 15(2), 265–286.
d'Aspremont A, El Ghaoui L, Jordan MI, Lanckriet GRG (2007). “A Direct Formulation for Sparse PCA Using Semidefinite Programming.” SIAM Review, 49(3), 434–448.
Ma S (2013). “Alternating Direction Method of Multipliers for Sparse Principal Component Analysis.” Journal of the Operations Research Society of China, 1(2), 253–274.
# \donttest{
## load the package and use iris data
library(Rdimtools)
data(iris, package="Rdimtools")
set.seed(100)
subid = sample(1:150,50)
X = as.matrix(iris[subid,1:4])
lab = as.factor(iris[subid,5])
## try different regularization parameters for sparsity
out1 <- do.spca(X,ndim=2,rho=0.01)
out2 <- do.spca(X,ndim=2,rho=1)
out3 <- do.spca(X,ndim=2,rho=100)
## visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(out1$Y, col=lab, pch=19, main="SPCA::rho=0.01")
plot(out2$Y, col=lab, pch=19, main="SPCA::rho=1")
plot(out3$Y, col=lab, pch=19, main="SPCA::rho=100")
par(opar)
# }
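Beyond the scatter plots, it can help to look at the loadings directly. The following is a minimal sketch that counts how many entries of the projection matrix are numerically nonzero for each value of `rho`. It assumes the returned object exposes the \((p\times ndim)\) basis as `projection`. The tolerance `1e-8` is an arbitrary choice, and whether entries come out exactly zero depends on any post-processing the package applies to the basis.

```r
library(Rdimtools)

## same data source as the example above, using all 150 observations
data(iris, package = "Rdimtools")
X <- as.matrix(iris[, 1:4])

## count numerically nonzero loadings for several sparsity levels
for (rho in c(0.01, 1, 100)) {
  out <- do.spca(X, ndim = 2, rho = rho)
  nz  <- sum(abs(out$projection) > 1e-8)   # 1e-8 is an assumed tolerance
  cat(sprintf("rho = %6.2f: %d nonzero loadings out of %d\n",
              rho, nz, length(out$projection)))
}
```

No particular counts are claimed here; the point is only to give a quick numerical check alongside the plots.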