Cross Validated Asked on January 7, 2022
I calculated principal components (PCs) for my samples. The data frame below has samples as rows and PCs as columns. To decide how many PCs to keep for my regression analysis, is the following approach valid?
> head(a)
PC1 PC2 PC3 PC4 PC5 PC6 PC7
1 -13.0692 3.825460 -2.8089500 -0.120865 -9.53690 2.2582600 0.975514
2 -13.0419 4.076040 -2.3597900 2.326170 -0.73101 -1.5689400 1.642810
3 -9.5570 4.270540 -0.9153700 -0.160893 -2.27807 -1.0854500 -0.551797
4 -11.4407 0.716765 -0.0932982 -1.229210 2.56851 -0.0708945 2.841000
5 -15.0062 6.971110 -2.9324700 -3.033660 -3.73211 1.8029200 0.712720
6 -13.8156 1.667130 -1.2647800 3.929120 4.12255 0.2541560 1.119040
PC8 PC9 PC10
1 -2.220460 1.15324 3.677270
2 -2.552010 -2.57720 0.111892
3 0.360637 0.30142 -1.288880
4 1.391550 -5.13552 -1.975630
5 1.937330 -1.83419 -1.462170
6 -0.637011 -3.15796 -1.238350
...
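a.cov <- cov(a)                              # covariance matrix of the PC scores
a.eigen <- eigen(a.cov)                      # eigenvalues = variance along each PC
PVE <- a.eigen$values / sum(a.eigen$values)  # proportion of variance explained per PC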
> PVE
[1] 0.49967626 0.22981763 0.07138644 0.04307668 0.03680999 0.02830493
[7] 0.02526709 0.02384502 0.02135397 0.02046199
So it seems that the first 4 PCs explain about 84% of the variance. Is this a valid way to decide how many PCs to keep?
Yes, looking at the proportion of variance explained is a common and generally reasonable way to choose how many principal components to include in your model.
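As a minimal sketch of that check, the cumulative proportion can be computed with `cumsum`. The PVE values are copied from the question; the names `cum_pve`, `threshold` and `k`, and the 0.80 cutoff, are assumptions made for illustration, not a recommended rule.

```r
# PVE values copied from the question's output
PVE <- c(0.49967626, 0.22981763, 0.07138644, 0.04307668, 0.03680999,
         0.02830493, 0.02526709, 0.02384502, 0.02135397, 0.02046199)

# Cumulative proportion of variance explained by the first k PCs
cum_pve <- cumsum(PVE)

# Smallest k whose cumulative proportion reaches the threshold
# (0.80 is an illustrative choice, not a recommendation)
threshold <- 0.80
k <- which(cum_pve >= threshold)[1]

print(round(cum_pve, 4))
print(k)
```

With these values the first three PCs just cross 0.80 and the first four reach about 0.84, so the answer depends noticeably on the cutoff you pick.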
It can also help to visualize the eigenvalues. Plot them from highest to lowest (a scree plot) and look for the point where the curve flattens out, so that later eigenvalues add little to the information content.
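A rough sketch of such a plot in base R, using the PVE values from the question (they are proportional to the eigenvalues, so the shape of the curve is the same). The variable `drops` is a hypothetical helper added here to make the flattening easier to read off numerically.

```r
# PVE values copied from the question's output
PVE <- c(0.49967626, 0.22981763, 0.07138644, 0.04307668, 0.03680999,
         0.02830493, 0.02526709, 0.02384502, 0.02135397, 0.02046199)

# Scree plot: variance explained per component, highest to lowest
plot(seq_along(PVE), PVE, type = "b",
     xlab = "Principal component",
     ylab = "Proportion of variance explained",
     main = "Scree plot")

# Decrease in PVE from each component to the next; the curve has
# flattened out once these decreases become small
drops <- -diff(PVE)
print(round(drops, 4))
```

Where exactly the curve "flattens" is a judgment call, so the plot is best read alongside the cumulative proportion rather than instead of it.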
Answered by phil on January 7, 2022