SPLS-package {SPLS}    R Documentation

Sparse Partial Least-Squares Regression

Description

SPLS performs sparse partial least-squares regression. Two approaches are provided: SPLS and SPLS2.
Usage

SPLS(X, Y, penalty = "HL", nc = 1, lambda = 0.01)
SPLS2(X, Y, penalty = "HL", nc = 1, lambda = 0.01)
Arguments

X        n-by-p data matrix of p predictors measured on n samples.

Y        n-by-q multivariate response data matrix of q variables measured
         on the same n samples.

penalty  Penalty type: "HL" is the unbounded penalty proposed by Lee and
         Oh (2009); "L1" is the L1 penalty.

nc       Number of latent components.

lambda   Tuning parameter controlling the sparsity.
Details

SPLS and SPLS2 are new formulations of the sparse partial least-squares (SPLS) procedure that allow simultaneous sparse variable selection and dimension reduction. Both methods support the standard L1 penalty and the unbounded penalty of Lee and Oh (2009). The computing algorithm for SPLS and SPLS2 is a modified version of the nonlinear iterative partial least-squares (NIPALS) algorithm.
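To make the modified NIPALS idea concrete, here is a minimal schematic of extracting one sparse component under the L1 penalty, where the direction vector is soft-thresholded at each iteration. The function names and update order below are illustrative assumptions, not the package's internal code; the HL penalty would replace the soft-thresholding step with its own shrinkage rule.

## Schematic only: one sparse NIPALS-style component with an L1 penalty.
## 'soft.threshold' and 'sparse.nipals.component' are hypothetical names.
soft.threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

sparse.nipals.component <- function(X, Y, lambda, maxit = 100, tol = 1e-6) {
  u <- Y[, 1]                                     # initialise score from a response column
  for (i in seq_len(maxit)) {
    w <- soft.threshold(crossprod(X, u), lambda)  # sparse X-direction vector
    if (all(w == 0)) stop("lambda too large: all loadings shrunk to zero")
    w <- w / sqrt(sum(w^2))                       # normalise the direction
    t <- X %*% w                                  # latent component (score)
    q <- crossprod(Y, t) / sum(t^2)               # Y-loadings
    u.new <- Y %*% q / sum(q^2)                   # updated score from the Y side
    if (sum((u.new - u)^2) < tol) break           # stop when the score stabilises
    u <- u.new
  }
  list(w = drop(w), t = drop(t))
}

For subsequent components, X is deflated by the extracted component before the iteration is repeated, which is where the direction vectors with respect to the original predictors (W) and with respect to the residual matrix (R) come to differ.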
Value

W       Matrix of direction vectors with respect to the original
        predictors X.

R       Matrix of direction vectors with respect to the residual matrix.

T       Matrix of latent components.

beta    Estimates of the regression coefficients.

lambda  Tuning parameter used for the fit.
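As a brief illustration of how the returned beta can be used: since the data in the Examples section are column-centered, fitted responses on the centered scale follow directly from the coefficient matrix. This is the standard PLS regression relation, assumed here rather than documented package behaviour, and the nc and lambda values are arbitrary.

## Sketch: fitted values from the returned coefficients, assuming
## column-centered X and Y as in the Examples section (nc and lambda
## chosen arbitrarily for illustration)
fit <- SPLS(X, Y, penalty = "HL", nc = 2, lambda = 0.05)
Y.hat <- X %*% fit$beta   # fitted responses on the centered scale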
Author(s)

Donghwan Lee, Woojoo Lee, Youngjo Lee and Yudi Pawitan

Maintainer: Woojoo Lee <lwj221@gmail.com> and Donghwan Lee <liebe02@snu.ac.kr>
References

Lee, D., Lee, W., Lee, Y. and Pawitan, Y. (2011). Sparse partial least-squares regression and its applications to high-throughput data analysis. Chemometrics and Intelligent Laboratory Systems, 109, 1-8.
Examples

## Generation of X and Y
n <- 40; p <- 100; q <- 10
var.h <- 25; nsr <- 0.1; p1 <- 5
set.seed(12345)
err.x <- rnorm(n * p, mean = 0, sd = sqrt(1))
err.y <- rnorm(n * q, mean = 0, sd = sqrt(nsr * 25 * var.h))
X <- matrix(err.x, n, p); X <- scale(X, center = TRUE, scale = FALSE)
h1 <- rnorm(n, mean = 0, sd = sqrt(var.h)); h1 <- c(scale(h1, center = TRUE, scale = FALSE))
h2 <- rnorm(n, mean = 0, sd = sqrt(var.h)); h2 <- c(scale(h2, center = TRUE, scale = FALSE))
h3 <- rnorm(n, mean = 0, sd = sqrt(var.h)); h3 <- c(scale(h3, center = TRUE, scale = FALSE))
X[, 1:p1] <- X[, 1:p1] + h1                            # predictors 1:5 carry h1
X[, (p1 + 1):(2 * p1)] <- X[, (p1 + 1):(2 * p1)] + h2  # predictors 6:10 carry h2
X[, (2 * p1 + 1):p] <- X[, (2 * p1 + 1):p] + h3        # remaining predictors carry h3
Y <- matrix(err.y, n, q); Y <- scale(Y, center = TRUE, scale = FALSE)
Y <- Y + 3 * h1 - 4 * h2                               # Y depends on h1 and h2 only

## SPLS approach
## 1. Find the optimal tuning parameters using 10-fold cross-validation
splsHL.cv <- cv.SPLS(X, Y, penalty = "HL", fold = 10, nc = 1:3,
                     lambda = seq(0.01, 0.1, length.out = 10), plot.it = FALSE)
## 2. SPLS with the optimal tuning parameters
splsHL.opt <- SPLS(X, Y, penalty = "HL", nc = splsHL.cv$nc.opt,
                   lambda = splsHL.cv$lambda.opt)
## Estimates of regression coefficients
splsHL.opt$beta

## SPLS2 approach
## 1. Find the optimal tuning parameters using 10-fold cross-validation
spls2.cv <- cv.SPLS2(X, Y, penalty = "HL", fold = 10, nc = 1:3,
                     lambda = seq(1, 10, length.out = 10), plot.it = FALSE)
## 2. SPLS2 with the optimal tuning parameters
spls2.opt <- SPLS2(X, Y, penalty = "HL", nc = spls2.cv$nc.opt,
                   lambda = spls2.cv$lambda.opt)
## Estimates of regression coefficients
spls2.opt$beta
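A short follow-up sketch: in this simulation only the first 2*p1 = 10 predictors influence Y (through h1 and h2), so the sparse coefficient matrix can be inspected to see which predictors were selected. The check assumes beta is returned as a p-by-q matrix, matching the dimensions described under Value.

## Which predictors received a nonzero coefficient?
## (assumes beta is p-by-q; the informative predictors are 1:10 here)
selected <- which(rowSums(abs(splsHL.opt$beta)) != 0)
selected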