Cross Validated Asked on January 7, 2022
I calculated PCs for my samples, and below is a data frame with samples as rows and PCs as columns. My question: is the following a valid approach for deciding how many PCs to keep for my regression analysis?
> head(a)
       PC1      PC2        PC3       PC4      PC5        PC6       PC7
1 -13.0692 3.825460 -2.8089500 -0.120865 -9.53690  2.2582600  0.975514
2 -13.0419 4.076040 -2.3597900  2.326170 -0.73101 -1.5689400  1.642810
3  -9.5570 4.270540 -0.9153700 -0.160893 -2.27807 -1.0854500 -0.551797
4 -11.4407 0.716765 -0.0932982 -1.229210  2.56851 -0.0708945  2.841000
5 -15.0062 6.971110 -2.9324700 -3.033660 -3.73211  1.8029200  0.712720
6 -13.8156 1.667130 -1.2647800  3.929120  4.12255  0.2541560  1.119040
        PC8      PC9      PC10
1 -2.220460  1.15324  3.677270
2 -2.552010 -2.57720  0.111892
3  0.360637  0.30142 -1.288880
4  1.391550 -5.13552 -1.975630
5  1.937330 -1.83419 -1.462170
6 -0.637011 -3.15796 -1.238350
...
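# Covariance matrix of the PC scores
a.cov <- cov(a)
# Eigen-decomposition; the eigenvalues are the variances along each component
a.eigen <- eigen(a.cov)
# Proportion of variance explained (PVE) by each component
PVE <- a.eigen$values / sum(a.eigen$values)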
> PVE
[1] 0.49967626 0.22981763 0.07138644 0.04307668 0.03680999 0.02830493
[7] 0.02526709 0.02384502 0.02135397 0.02046199
So it seems that the first 4 PCs explain about 84% of the variance. Is this a valid way to decide how many PCs to keep?
Yes, typically this is a good way to select how many principal components to include in your model.
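To make the "first k PCs explain X% of the variance" arithmetic explicit, here is a minimal sketch using the PVE values printed in the question. The 0.80 threshold and the names cum_pve, threshold and k are illustrative assumptions, not a recommended rule. Note that these proportions sum to 1 over the ten PCs shown, so they are relative to those ten components.

```r
# PVE values copied from the output in the question
PVE <- c(0.49967626, 0.22981763, 0.07138644, 0.04307668, 0.03680999,
         0.02830493, 0.02526709, 0.02384502, 0.02135397, 0.02046199)

# Running total: share of variance explained by the first k PCs
cum_pve <- cumsum(PVE)
print(round(cum_pve, 3))

# Smallest k whose cumulative PVE reaches the chosen threshold
# (0.80 is an assumed cutoff for illustration only)
threshold <- 0.80
k <- which(cum_pve >= threshold)[1]
print(k)
```

Changing threshold shows how sensitive the choice of k is to the cutoff, which is one reason to look at the whole curve rather than a single number.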
It could also help to visualize the eigenvalues. Plot them from highest to lowest (a scree plot) and look for the point where the curve flattens out, since the eigenvalues after that point each add little to the variance explained.
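A scree plot of this kind can be sketched in base R as follows. It uses the PVE values from the question instead of the raw eigenvalues (the shape of the curve is the same, since PVE is just the eigenvalues rescaled), and the variable name drops is an assumption introduced here for illustration.

```r
# PVE values copied from the output in the question, already sorted high to low
PVE <- c(0.49967626, 0.22981763, 0.07138644, 0.04307668, 0.03680999,
         0.02830493, 0.02526709, 0.02384502, 0.02135397, 0.02046199)

# Scree plot: one point per component, joined by lines
plot(seq_along(PVE), PVE, type = "b",
     xlab = "Principal component",
     ylab = "Proportion of variance explained",
     main = "Scree plot")

# Drop in PVE from each component to the next; small drops mark the flat part
drops <- -diff(PVE)
print(round(drops, 4))
```

Where exactly the curve "flattens" is a judgment call, so the printed drops are only an aid to reading the plot, not a decision rule.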
Answered by phil on January 7, 2022