principal {psych}  R Documentation 
Does an eigen value decomposition and returns eigen values, loadings, and degree of fit for a specified number of components. Basically it is just doing a principal components analysis (PCA) for n principal components of either a correlation or covariance matrix. Can show the residual correlations as well. The quality of reduction in the squared correlations is reported by comparing residual correlations to original correlations. Unlike princomp, this returns a subset of just the best nfactors. The eigen vectors are rescaled by the sqrt of the eigen values to produce the component loadings more typical in factor analysis.
principal(r, nfactors = 1, residuals = FALSE,rotate="varimax",n.obs=NA, covar=FALSE, scores=TRUE,missing=FALSE,impute="median",oblique.scores=TRUE,method="regression",...)
r 
a correlation matrix. If a raw data matrix is used, the correlations will be found using pairwise deletions for missing values. 
nfactors 
Number of components to extract 
residuals 
FALSE, do not show residuals, TRUE, report residuals 
rotate 
"none", "varimax", "quatimax", "promax", "oblimin", "simplimax", and "cluster" are possible rotations/transformations of the solution. 
n.obs 
Number of observations used to find the correlation matrix if using a correlation matrix. Used for finding the goodness of fit statistics. 
covar 
If false, find the correlation matrix from the raw data or convert to a correlation matrix if given a square matrix as input. 
scores 
If TRUE, find component scores 
missing 
if scores are TRUE, and missing=TRUE, then impute missing values using either the median or the mean 
impute 
"median" or "mean" values are used to replace missing values 
oblique.scores 
If TRUE (default), then the component scores are based upon the structure matrix. If FALSE, upon the pattern matrix. 
method 
Which way of finding component scores should be used. The default is "regression" 
... 
other parameters to pass to functions such as factor.scores 
Useful for those cases where the correlation matrix is improper (perhaps because of SAPA techniques).
There are a number of data reduction techniques including principal components analysis (PCA) and factor analysis (EFA). Both PC and FA attempt to approximate a given correlation or covariance matrix of rank n with matrix of lower rank (p). nRn = nFk kFn' + U2 where k is much less than n. For principal components, the item uniqueness is assumed to be zero and all elements of the correlation or covariance matrix are fitted. That is, nRn = nFk kFn' The primary empirical difference between a components versus a factor model is the treatment of the variances for each item. Philosophically, components are weighted composites of observed variables while in the factor model, variables are weighted composites of the factors.
For a n x n correlation matrix, the n principal components completely reproduce the correlation matrix. However, if just the first k principal components are extracted, this is the best k dimensional approximation of the matrix.
It is important to recognize that rotated principal components are not principal components (the axes associated with the eigen value decomposition) but are merely components. To point this out, unrotated principal components are labelled as PCi, while rotated PCs are now labeled as RCi (for rotated components) and obliquely transformed components as TCi (for transformed components). (Thanks to Ulrike Gromping for this suggestion.)
Rotations and transformations are either part of psych (Promax and cluster), of base R (varimax), or of GPArotation (simplimax, quartimax, oblimin).
Some of the statistics reported are more appropriate for (maximum likelihood) factor analysis rather than principal components analysis, and are reported to allow comparisons with these other models.
Although for items, it is typical to find component scores by scoring the salient items (using, e.g., score.items
) component scores are found by regression where the regression weights are R^(1) lambda where lambda is the matrix of component loadings. The regression approach is done to be parallel with the factor analysis function fa
. The regression weights are found from the inverse of the correlation matrix times the component loadings. This has the result that the component scores are standard scores (mean=0, sd = 1) of the standardized input. A comparison to the scores from princomp
shows this difference. princomp does not, by default, standardize the data matrix, nor are the components themselves standardized. By default, the regression weights are found from the Structure matrix, not the Pattern matrix.
Jolliffe (2002) discusses why the interpretation of rotated components is complicated. The approach used here is consistent with the factor analytic tradition. The correlations of the items with the component scores closely matches (as it should) the component loadings.
values 
Eigen Values of all components – useful for a scree plot 
rotation 
which rotation was requested? 
n.obs 
number of observations specified or found 
communality 
Communality estimates for each item. These are merely the sum of squared factor loadings for that item. 
loadings 
A standard loading matrix of class “loadings" 
fit 
Fit of the model to the correlation matrix 
fit.off 
how well are the off diagonal elements reproduced? 
residual 
Residual matrix – if requested 
dof 
Degrees of Freedom for this model. This is the number of observed correlations minus the number of independent parameters (number of items * number of factors  nf*(nf1)/2. That is, dof = niI * (ni1)/2  ni * nf + nf*(nf1)/2. 
objective 
value of the function that is minimized by maximum likelihood procedures. This is reported for comparison purposes and as a way to estimate chi square goodness of fit. The objective function is

STATISTIC 
If the number of observations is specified or found, this is a chi square based upon the objective function, f. Using the formula from 
PVAL 
If n.obs > 0, then what is the probability of observing a chisquare this large or larger? 
phi 
If oblique rotations (using oblimin from the GPArotation package) are requested, what is the interfactor correlation. 
scores 
If scores=TRUE, then estimates of the factor scores are reported 
weights 
The beta weights to find the principal components from the data 
R2 
The multiple R square between the factors and factor score estimates, if they were to be found. (From Grice, 2001) For components, these are of course 1.0. 
valid 
The correlations of the component score estimates with the components, if they were to be found and unit weights were used. (So called course coding). 
William Revelle
Grice, James W. (2001), Computing and evaluating factor scores. Psychological Methods, 6, 430450
Jolliffe, I. (2002) Principal Component Analysis (2nd ed). Springer.
Revelle, W. An introduction to psychometric theory with applications in R (in prep) Springer. Draft chapters available at http://personalityproject.org/r/book/
VSS
(to test for the number of components or factors to extract), VSS.scree
and fa.parallel
to show a scree plot and compare it with random resamplings of the data), factor2cluster
(for course coding keys), fa
(for factor analysis), factor.congruence
(to compare solutions), predict.psych
to find factor/component scores for a new data set based upon the weights from an original data set.
#Four principal components of the Harmon 24 variable problem #compare to a four factor principal axes solution using factor.congruence pc < principal(Harman74.cor$cov,4,rotate="varimax") mr < fa(Harman74.cor$cov,4,rotate="varimax") #minres factor analysis pa < fa(Harman74.cor$cov,4,rotate="varimax",fm="pa") # principal axis factor analysis round(factor.congruence(list(pc,mr,pa)),2) pc2 < principal(Harman.5,2,rotate="varimax",scores=TRUE) pc2 round(cor(Harman.5,pc2$scores),2) #compare these correlations to the loadings biplot(pc2,main="Biplot of the Harman.5 socioeconomic variables")