Covariance- v. correlation-matrix based PCA

Mathematics Asked by Lucozade on December 5, 2020

In principal component analysis (PCA), one can choose either the covariance matrix or the correlation matrix to find the components. These give different results because, I suspect, the two matrices do not have the same eigenvectors: similar matrices share the same eigenvalues, but not necessarily the same eigenvectors. Several questions: (1) Why this difference? (2) Does PCA make sense if it can give two different answers? (3) Which of the two methods is ‘best’? (4) Since correlation-based PCA operates on standardized, not raw, data, i.e., each variable is scaled by its standard deviation, does it make sense to use its results to draw conclusions about the dominance of variation in the actual, unstandardized data?

One Answer

The problem with not standardizing, i.e. with not scaling each variable by its standard deviation, is that the result then depends on an arbitrary choice of units: if, for example, one variable is measured in centimeters and another in dollars, then changing centimeters to meters can actually change the eigenvectors of the covariance matrix. Hence I'd use the correlation matrix.
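The unit-dependence can be checked numerically. The sketch below (a minimal illustration with synthetic data, not from the original post) rescales one column from centimeters to meters and compares the leading eigenvector of the covariance matrix before and after; the correlation matrix, being scale-invariant, gives the same eigenvector in both cases.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: column 0 a length in centimeters, column 1 a price in dollars.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 1.0], [0.0, 2.0]])

def leading_pc(M):
    """Unit-norm eigenvector for the largest eigenvalue, with a fixed sign."""
    vals, vecs = np.linalg.eigh(M)
    v = vecs[:, -1]
    return v * np.sign(v[0])

# Covariance-based PCA: converting cm -> m (divide column 0 by 100)
# changes the leading eigenvector.
X_m = X.copy()
X_m[:, 0] /= 100.0
v_cov_cm = leading_pc(np.cov(X, rowvar=False))
v_cov_m = leading_pc(np.cov(X_m, rowvar=False))
print(np.allclose(v_cov_cm, v_cov_m))   # False: the units changed the answer

# Correlation-based PCA is invariant to the units of each variable.
v_cor_cm = leading_pc(np.corrcoef(X, rowvar=False))
v_cor_m = leading_pc(np.corrcoef(X_m, rowvar=False))
print(np.allclose(v_cor_cm, v_cor_m))   # True
```

The same invariance holds for all the eigenvectors, not just the leading one, since the correlation matrix itself is unchanged by rescaling any column.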

Answered by Michael Hardy on December 5, 2020

