Solved – Principal component analysis result

machine learning, pca, r, standard deviation

I'm trying to understand the result of PCA and hoped you could help me understand it better.

us.pca1 <- prcomp(USArrests)
us.pca1$sdev
[1] 83.732400 14.212402  6.489426  2.482790

Here I see that the standard deviations for the variables Murder, Assault, UrbanPop and Rape are 83, 14, 6 and 2 respectively. But when I use the scale=T option I get different SDs.

us.pca2 <- prcomp(USArrests,scale=T)
us.pca2$sdev
[1] 1.5748783 0.9948694 0.5971291 0.4164494

From the Help, I know that scale=T means the variables are scaled to have unit variance before the analysis takes place. But what does this actually mean?
By the way, if I calculate the SD in the usual way, I get yet another result.

sd(USArrests$Murder)
[1] 4.35551

Can someone explain what these three different SDs indicate?

Another question, regarding the actual result in the $rotation matrix.

us.pca2$rotation
           PC1        PC2        PC3         PC4
Murder   -0.5358995  0.4181809 -0.3412327  0.64922780
Assault  -0.5831836  0.1879856 -0.2681484 -0.74340748
UrbanPop -0.2781909 -0.8728062 -0.3780158  0.13387773
Rape     -0.5434321 -0.1673186  0.8177779  0.08902432

Are these Eigenvalues or some percentage? What should I conclude from this result? Any help or link for further reading will be appreciated.

Best Answer

1. What is scaling

Scaling refers to techniques such as standardization and normalization which change the scale of your data. In this case, it specifically refers to standardization. More specifically, it sounds like Z-score scaling.

This means the variables are recalculated as (V - mean of V)/s, where "s" is the standard deviation of V. As a result, all variables in the data set have the same mean (0) and the same standard deviation (1), but different ranges.
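As a minimal sketch of that formula in R (the names `X`, `Z_manual`, `Z_builtin`, `sdev_prescaled` and `sdev_option` are my own, not from the question), standardizing by hand should agree with base R's `scale()`, and running `prcomp` on the standardized data should give the same standard deviations as the scaling option:

```r
# USArrests ships with base R (datasets package).
X <- as.matrix(USArrests)

# Z-score each column by hand: (V - mean of V) / s
Z_manual <- sweep(X, 2, colMeans(X), "-")
Z_manual <- sweep(Z_manual, 2, apply(X, 2, sd), "/")

# Base R's scale() does the same centering and scaling.
Z_builtin <- scale(X)
all.equal(Z_manual, Z_builtin, check.attributes = FALSE)

# Every standardized column has mean 0 and standard deviation 1.
round(colMeans(Z_manual), 10)
apply(Z_manual, 2, sd)

# PCA on pre-standardized data matches PCA with the scaling option.
sdev_prescaled <- prcomp(Z_manual)$sdev
sdev_option    <- prcomp(USArrests, scale. = TRUE)$sdev
all.equal(sdev_prescaled, sdev_option)
```

The full argument name is `scale.`; `scale=T` in the question works through R's partial argument matching.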

I would type out the equation for you, but I'm bad with LaTeX/MathJax, so please see the links below.

http://www.benetzkorn.com/2011/11/data-normalization-and-standardization/

http://www.ats.ucla.edu/Stat/stata/faq/standardize.htm

What's the difference between Normalization and Standardization?

2. What are these 3 different Standard Deviations

They were:

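A. The standard deviations of the 4 principal components computed from the raw (unscaled) data. These are not the standard deviations of the variables Murder, Assault, UrbanPop and Rape themselves.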

Note that the singular values of the centered data matrix are equal to the square roots of the eigenvalues of the covariance matrix, up to a scaling factor sqrt(N-1), where N is the number of data points. That is, each standard deviation in $sdev is a singular value divided by sqrt(N-1).

B. The standard deviations of the 4 principal components computed from the scaled (standardized) data.

C. The ordinary standard deviation of the original variable Murder.
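The three quantities can be checked directly. The sketch below is only an illustration (object names such as `pca_raw`, `pca_scaled` and `centered` are my own); it assumes `prcomp`'s default centering and the usual N-1 divisor for variances.

```r
pca_raw    <- prcomp(USArrests)                 # A: unscaled data
pca_scaled <- prcomp(USArrests, scale. = TRUE)  # B: standardized data

# A. sdev is the standard deviation of each column of PC scores ...
apply(pca_raw$x, 2, sd)
# ... which is the square root of the eigenvalues of the covariance matrix ...
sqrt(eigen(cov(USArrests))$values)
# ... and the singular values of the centered data divided by sqrt(N - 1).
centered <- scale(USArrests, center = TRUE, scale = FALSE)
svd(centered)$d / sqrt(nrow(USArrests) - 1)

# B. After standardizing, the covariance matrix of the data is the
#    correlation matrix, so sdev comes from its eigenvalues instead.
sqrt(eigen(cor(USArrests))$values)

# C. An ordinary standard deviation of one original variable.
sd(USArrests$Murder)
```

One consequence worth noticing: in case B the squared standard deviations sum to the number of variables (4 here), because each standardized variable contributes a variance of 1.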

Why scale? There are several reasons. One is that it puts variables measured in different units on a common footing, which makes them easier to compare. Another is that methods involving gradient descent or other iterative solvers tend to converge more quickly on scaled data.
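For PCA on this particular data set, one way to see what scaling changes is to compare the variances of the original variables with the first loading vector of each fit. This is a sketch rather than part of the argument above, and the object names (`pca_raw`, `pca_scaled`, `raw_loadings`, `scaled_loadings`) are my own.

```r
# Variances of the original variables differ by orders of magnitude.
sapply(USArrests, var)

# Without scaling, the first component is dominated by the variable
# with the largest variance.
pca_raw <- prcomp(USArrests)
raw_loadings <- abs(pca_raw$rotation[, "PC1"])
names(which.max(raw_loadings))

# With scaling, the PC1 loadings are spread more evenly across variables.
pca_scaled <- prcomp(USArrests, scale. = TRUE)
scaled_loadings <- abs(pca_scaled$rotation[, "PC1"])
scaled_loadings
```

Absolute values are used because the sign of a principal component is arbitrary.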
