Principal Components Analysis: prcomp in R

Source: Internet; Editor: 程序博客网; Time: 2024/05/16 11:01

Principal Components Analysis

Description

Performs a principal components analysis on the given data matrix and returns the results as an object of class prcomp.

Usage

prcomp(x, ...)

## S3 method for class 'formula'
prcomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, ...)

## S3 method for class 'prcomp'
predict(object, newdata, ...)

Arguments

formula

a formula with no response variable, referring only to numeric variables.

data

an optional data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).


subset

an optional vector used to select rows (observations) of the data matrix x.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The ‘factory-fresh’ default is na.omit.

...

arguments passed to or from other methods. If x is a formula one might specify scale. or tol.

x

a numeric or complex matrix (or data frame) which provides the data for the principal components analysis.

retx

a logical value indicating whether the rotated variables should be returned.

center

a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.

scale.

a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE for consistency with S, but in general scaling is advisable. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.


tol

a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) With the default NULL setting, no components are omitted. Other settings for tol could be tol = 0 or tol = sqrt(.Machine$double.eps), which would omit essentially constant components.

object

Object of class inheriting from "prcomp".

newdata

An optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, newdata must contain columns with the same names. Otherwise it must contain the same number of columns, to be used in the same order.
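The arguments above can be tried together in a short sketch. The 40/10 row split of USArrests and the value tol = 0.5 are illustrative assumptions chosen only to make the behaviour visible; they are not recommendations.

```r
# Illustrative split of the built-in USArrests data (an assumption for this sketch)
train <- USArrests[1:40, ]
test  <- USArrests[41:50, ]

# center = TRUE (the default) and scale. = TRUE standardise each column first
fit <- prcomp(train, center = TRUE, scale. = TRUE)

# predict() applies the stored centering, scaling and rotation to newdata;
# newdata must carry the same column names as the data used for the fit
scores_new <- predict(fit, newdata = test)
dim(scores_new)

# Without newdata, predict() returns the scores of the original fit
all.equal(predict(fit), fit$x)

# tol omits components whose standard deviation is at most
# tol times the standard deviation of the first component
fit_tol <- prcomp(train, scale. = TRUE, tol = 0.5)
ncol(fit_tol$rotation)  # number of components kept
```

Passing retx = FALSE would drop the x element from the result, in which case predict() needs an explicit newdata.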

Details

The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. The print method for these objects prints the results in a nice format and the plot method produces a scree plot.

Unlike princomp, variances are computed with the usual divisor N - 1.
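The relationship between the SVD, the eigen decomposition and the N - 1 divisor can be checked by hand. The following is a minimal sketch on the built-in USArrests data; it mirrors the description above rather than reproducing the actual source of prcomp.

```r
X <- as.matrix(USArrests)
n <- nrow(X)

# Centre and scale the data with scale(), as described for center and scale.
Z <- scale(X, center = TRUE, scale = TRUE)

# SVD route: standard deviations are the singular values divided by sqrt(n - 1)
s <- svd(Z)
sdev_svd <- s$d / sqrt(n - 1)

# Eigen route on the correlation matrix, shown for comparison only
sdev_eig <- sqrt(eigen(cor(X), symmetric = TRUE)$values)

fit <- prcomp(X, scale. = TRUE)
all.equal(fit$sdev, sdev_svd)
all.equal(fit$sdev, sdev_eig)

# princomp uses divisor N, so on the covariance scale its standard
# deviations are smaller by a factor of sqrt((n - 1) / n)
fit_cov <- prcomp(X)
pc_cov  <- princomp(X)
all.equal(unname(pc_cov$sdev), fit_cov$sdev * sqrt((n - 1) / n))
```

On well-conditioned data such as this the two routes agree to numerical tolerance; the accuracy advantage of the SVD shows up mainly on ill-conditioned matrices.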


Note that scale = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
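A small sketch of this restriction follows. The data frame d and its column names are hypothetical, and dropping zero-variance columns is only one possible workaround, not something prcomp does for you.

```r
# Hypothetical data: column "k" is constant, so its variance is zero
d <- data.frame(a = c(1, 2, 3, 4), b = c(2, 1, 4, 3), k = c(5, 5, 5, 5))

# prcomp() is expected to signal an error here; capture the message
# instead of stopping the script
msg <- tryCatch({
  prcomp(d, scale. = TRUE)
  NA_character_
}, error = function(e) conditionMessage(e))
msg

# One possible workaround: drop zero-variance columns before scaling
keep <- vapply(d, function(col) var(col) > 0, logical(1))
fit <- prcomp(d[, keep, drop = FALSE], scale. = TRUE)
rownames(fit$rotation)  # variables that entered the analysis
```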

Value

prcomp returns a list with class "prcomp" containing the following components:

sdev

the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix).

rotation

the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). The function princomp returns this in the element loadings.

x

if retx is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the rotation matrix) is returned. Hence, cov(x) is the diagonal matrix diag(sdev^2). For the formula method, napredict() is applied to handle the treatment of values omitted by the na.action.

center, scale

the centering and scaling used, or FALSE.
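The components listed above can be inspected and cross-checked directly. This is a minimal sketch on USArrests; the variable names Z and X_back are introduced here for illustration.

```r
fit <- prcomp(USArrests, scale. = TRUE)

names(fit)  # the components described above

# The scores are uncorrelated, with variances sdev^2
all.equal(unname(cov(fit$x)), diag(fit$sdev^2))

# x is the centred and scaled data multiplied by the rotation matrix
Z <- scale(USArrests, center = fit$center, scale = fit$scale)
all.equal(unname(Z %*% fit$rotation), unname(fit$x))

# With all components kept, rotation is orthogonal, so the original
# data can be recovered from x, rotation, scale and center
X_back <- fit$x %*% t(fit$rotation)
X_back <- sweep(X_back, 2, fit$scale, "*")
X_back <- sweep(X_back, 2, fit$center, "+")
all.equal(unname(X_back), unname(as.matrix(USArrests)))
```

If components were omitted through tol, the last step would give only an approximation of the original data.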

Note

The signs of the columns of the rotation matrix are arbitrary, and so may differ between different programs for PCA, and even between different builds of R.
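When results need to be compared across programs or machines, one option is to impose a sign convention after fitting. The helper fix_signs below is hypothetical and not part of R; the convention it applies (make the loading with the largest absolute value positive in each component) is just one reasonable choice.

```r
# Hypothetical helper: flip each component so that its loading with the
# largest absolute value is positive. Not part of prcomp itself.
fix_signs <- function(fit) {
  flip <- apply(fit$rotation, 2, function(v) sign(v[which.max(abs(v))]))
  fit$rotation <- sweep(fit$rotation, 2, flip, "*")
  # Flip the scores with the same signs so x stays consistent with rotation
  if (!is.null(fit$x)) fit$x <- sweep(fit$x, 2, flip, "*")
  fit
}

fit  <- prcomp(USArrests, scale. = TRUE)
fit2 <- fix_signs(fit)

# Flipping signs changes neither the standard deviations
# nor the magnitudes of the scores
all.equal(fit$sdev, fit2$sdev)
all.equal(abs(fit$x), abs(fit2$x))
```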

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

Mardia, K. V., J. T. Kent, and J. M. Bibby (1979) Multivariate Analysis, London: Academic Press.

Venables, W. N. and B. D. Ripley (2002) Modern Applied Statistics with S, Springer-Verlag.

See Also

biplot.prcomp, screeplot, princomp, cor, cov, svd, eigen.

Examples

## signs are random
require(graphics)

## the variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
prcomp(USArrests)  # inappropriate
prcomp(USArrests, scale = TRUE)
prcomp(~ Murder + Assault + Rape, data = USArrests, scale = TRUE)
plot(prcomp(USArrests))
summary(prcomp(USArrests, scale = TRUE))
biplot(prcomp(USArrests, scale = TRUE))