Covariance- vs. correlation-matrix-based PCA

Mathematics Asked by Lucozade on December 5, 2020

In principal component analysis (PCA), one can choose either the covariance matrix or the correlation matrix to find the components. These give different results because, I suspect, the eigenvectors of the two matrices are not equal. (Mathematically) similar matrices have the same eigenvalues, but not necessarily the same eigenvectors. Several questions: (1) Why is there this difference? (2) Does PCA make sense if you can get two different answers? (3) Which of the two methods is 'best'? (4) Since PCA operates on standardized (not raw) data in both cases, i.e., data scaled by their standard deviation, does it make sense to use the results to draw conclusions about the dominance of variation for the actual, unstandardized data?

One Answer

The problem with not standardizing, i.e. with not scaling the variables by their standard deviations, is that the result depends on the units of measurement. If, for example, one variable is measured in centimeters and another in dollars, then changing centimeters to meters rescales that variable's variance and can actually change the eigenvectors of the covariance matrix, so an arbitrary choice of units can alter the results. The correlation matrix is unaffected by such a rescaling. Hence I'd use the correlation matrix.
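The unit-dependence described in this answer can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the synthetic data, the variable names (`length_cm`, `price_usd`) and the helper `leading_direction` are made up for illustration and are not part of the original answer. It compares the first principal direction obtained from the covariance matrix and from the correlation matrix before and after converting centimeters to meters.

```python
import numpy as np


def leading_direction(matrix):
    """Unit eigenvector of a symmetric matrix belonging to its largest eigenvalue."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    v = eigenvectors[:, -1]
    # An eigenvector is only defined up to sign; make its largest entry positive
    # so that two results can be compared directly.
    return v if v[np.argmax(np.abs(v))] > 0 else -v


# Hypothetical data: a length in centimeters and a positively related price in dollars.
rng = np.random.default_rng(0)
n = 500
length_cm = rng.normal(170.0, 10.0, size=n)
price_usd = 0.5 * length_cm + rng.normal(0.0, 5.0, size=n)

data_cm = np.column_stack([length_cm, price_usd])
data_m = np.column_stack([length_cm / 100.0, price_usd])  # same lengths, now in meters

# Covariance-based PCA: the first direction depends on the chosen unit.
pc1_cov_cm = leading_direction(np.cov(data_cm, rowvar=False))
pc1_cov_m = leading_direction(np.cov(data_m, rowvar=False))

# Correlation-based PCA: the correlation matrix is unchanged by the rescaling.
corr_cm = np.corrcoef(data_cm, rowvar=False)
corr_m = np.corrcoef(data_m, rowvar=False)
pc1_corr_cm = leading_direction(corr_cm)
pc1_corr_m = leading_direction(corr_m)

print("covariance PC1  (cm):", pc1_cov_cm)
print("covariance PC1  (m): ", pc1_cov_m)
print("correlation PC1 (cm):", pc1_corr_cm)
print("correlation PC1 (m): ", pc1_corr_m)
```

With the length in meters its variance becomes tiny compared with that of the price, so the covariance-based first direction should point almost entirely along the price axis, while the correlation-based direction should be the same for both unit choices.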

Answered by Michael Hardy on December 5, 2020
