I'm trying to understand the results of PCA and thought you could help me understand them better.
us.pca1 <- prcomp(USArrests)
us.pca1$sdev
[1] 83.732400 14.212402 6.489426 2.482790
Here I see that the standard deviations for the variables Murder, Assault, UrbanPop and Rape are 83, 14, 6 and 2 respectively. But when I use the scale=T option, I get different SDs.
us.pca2 <- prcomp(USArrests,scale=T)
us.pca2$sdev
[1] 1.5748783 0.9948694 0.5971291 0.4164494
From the Help, I know that scale=T is used so that the variables are scaled to have unit variance before the analysis takes place. But what does this actually mean?
By the way, if I calculate the SD in the usual way, I get a different result.
sd(USArrests$Murder)
[1] 4.35551
Can someone help me understand what these three different SDs indicate?
Another question, regarding the actual result in the $rotation matrix.
us.pca2$rotation
                PC1        PC2        PC3         PC4
Murder   -0.5358995  0.4181809 -0.3412327  0.64922780
Assault  -0.5831836  0.1879856 -0.2681484 -0.74340748
UrbanPop -0.2781909 -0.8728062 -0.3780158  0.13387773
Rape     -0.5434321 -0.1673186  0.8177779  0.08902432
Are these Eigenvalues or some percentage? What should I conclude from this result? Any help or link for further reading will be appreciated.
Best Answer
1. What is scaling
Scaling refers to techniques such as standardization and normalization, which change the scale of your data. In this case it refers to standardization, and more specifically to Z-score scaling: with scale=T, prcomp divides each centered variable by its standard deviation.
This means each variable is recalculated as (V - mean of V)/s, where "s" is the standard deviation of V. As a result, all variables in the data set have equal means (0) and standard deviations (1), but different ranges.
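As a rough sketch of what that looks like in R (the names X, Z_manual and Z_builtin are just made up for this illustration), you can standardize the columns by hand and check that the result matches scale() and what prcomp does with scale = TRUE:

```r
# Standardize each column by hand: (V - mean of V) / s
X <- as.matrix(USArrests)
Z_manual <- sweep(sweep(X, 2, colMeans(X), "-"), 2, apply(X, 2, sd), "/")

# scale() centers and scales the columns the same way by default
Z_builtin <- scale(X)
all.equal(Z_manual, Z_builtin, check.attributes = FALSE)

# Every column now has mean 0 (up to rounding) and standard deviation 1
round(colMeans(Z_manual), 10)
apply(Z_manual, 2, sd)

# The ranges still differ from column to column
apply(Z_manual, 2, range)

# prcomp(..., scale = TRUE) gives the same component SDs as
# running prcomp on the hand-standardized data
all.equal(prcomp(USArrests, scale = TRUE)$sdev, prcomp(Z_manual)$sdev)
```

Each all.equal() call is there as a check that the two routes agree up to numerical tolerance.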
I would type out the equation for you, but I'm bad with LaTeX/MathJax, so please see the links below.
http://www.benetzkorn.com/2011/11/data-normalization-and-standardization/
http://www.ats.ucla.edu/Stat/stata/faq/standardize.htm
What's the difference between Normalization and Standardization?
2. What are these 3 different Standard Deviations
They are:
A. The raw standard deviations of the (first 4) principal components, that is, of the components computed from the unscaled data.
Note that the singular values of the (centered) data matrix are equal to the square roots of the eigenvalues of the covariance matrix, up to a scaling factor of sqrt(N-1), where N is the number of data points.
B. The scaled standard deviations of the (first 4) principal components (or, more precisely, the standard deviations of the (first 4) principal components computed from the scaled data).
C. The raw standard deviation of the variable Murder.
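If it helps, here is a small sketch that reproduces the three numbers side by side. It reuses us.pca1 and us.pca2 from the question; sd_A, sd_B, N and centered are names I picked for the illustration.

```r
us.pca1 <- prcomp(USArrests)                # A: unscaled data
us.pca2 <- prcomp(USArrests, scale = TRUE)  # B: standardized data

# A. sdev equals the square roots of the eigenvalues of the covariance matrix
sd_A <- sqrt(eigen(cov(USArrests))$values)
all.equal(us.pca1$sdev, sd_A)

# Equivalently: singular values of the centered data matrix, divided by sqrt(N - 1)
N <- nrow(USArrests)
centered <- scale(USArrests, center = TRUE, scale = FALSE)
all.equal(us.pca1$sdev, svd(centered)$d / sqrt(N - 1))

# B. After standardizing, the covariance matrix of the data is the
#    correlation matrix of the original variables
sd_B <- sqrt(eigen(cor(USArrests))$values)
all.equal(us.pca2$sdev, sd_B)

# C. Ordinary standard deviation of one original variable
sd(USArrests$Murder)

# In both A and B, each sdev is the SD of the matching column of scores in $x
all.equal(us.pca1$sdev, unname(apply(us.pca1$x, 2, sd)))
```

So A and B describe the spread of the principal component scores (columns of $x), while C describes the spread of a single original column.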
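Why scale? Well, there are lots of reasons. One is that it makes the data easier for some people to analyze quickly. Another is that methods involving gradient descent or other iterative solvers tend to converge faster on scaled data. For PCA in particular, the components chase variance, so without scaling a variable measured on a larger scale can dominate the first component.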
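One PCA-specific way to see the effect of scaling on this data set, again only as a sketch (vars, pc1_raw and pc1_scaled are illustrative names): compare the variances of the original variables with the absolute PC1 loadings before and after scaling.

```r
# Variances of the original variables are far from comparable
vars <- apply(USArrests, 2, var)
vars

# Unscaled PCA: absolute PC1 loadings show which variable carries the weight
pc1_raw <- abs(prcomp(USArrests)$rotation[, "PC1"])
pc1_raw

# Scaled PCA: absolute PC1 loadings (same magnitudes as us.pca2$rotation in the question)
pc1_scaled <- abs(prcomp(USArrests, scale = TRUE)$rotation[, "PC1"])
pc1_scaled

# Which variable has the largest variance, and the largest unscaled loading?
names(which.max(vars))
names(which.max(pc1_raw))
```

I take absolute values because the sign of a principal component is arbitrary and may differ between runs or platforms.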