Logistic regression with lasso versus PCA?

Cross Validated Asked by Manas on December 25, 2021

Got asked this question in an interview. I know the main difference is that lasso is a regularization technique (adding a penalty term that shrinks large coefficients toward zero), while PCA is a dimensionality reduction technique (via decomposition of the covariance matrix).

I answered that PCA lets you do feature selection outside of the fit and transform, and therefore gives more flexibility in the hyperparameter search, whereas in lasso the "feature selection" is done for you, so there is less scope for hyperparameter optimization.

Does that sound right?

2 Answers

  • PCA, while reducing the number of features, ignores the class labels. All it cares about is preserving the maximum variance, which is not always optimal for a classification task.

  • L1 regularization, on the other hand, pushes toward zero the coefficients of features that have little correlation with the class labels. Hence, it strives to reduce the number of active features while still achieving good classification performance.

  • To avoid under-fitting, we can always do hyperparameter tuning to find the best $\lambda$.
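The tuning mentioned in the last bullet can be sketched as follows. This is an illustrative example, not from the original answer: it uses scikit-learn's breast cancer dataset, and note that scikit-learn parameterizes the penalty with `C`, the *inverse* of $\lambda$ (smaller `C` means stronger regularization).

```python
# Hypothetical sketch: cross-validated tuning of the L1 penalty strength
# for logistic regression. In scikit-learn, C is the inverse of lambda.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

pipe = make_pipeline(
    StandardScaler(),  # L1 penalties are scale-sensitive, so standardize first
    LogisticRegression(penalty="l1", solver="liblinear", max_iter=5000),
)
grid = GridSearchCV(
    pipe,
    {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
grid.fit(X, y)

# Count how many features survive the L1 penalty at the best C
best = grid.best_estimator_.named_steps["logisticregression"]
n_kept = int(np.sum(best.coef_ != 0))
print(n_kept, "of", X.shape[1], "features kept; CV accuracy:", grid.best_score_)
```

The grid of `C` values here is arbitrary; in practice a log-spaced grid (or `LogisticRegressionCV`) is a common choice.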

Answered by Hardik Vagadia on December 25, 2021

I answered that PCA lets you do feature selection outside of the fit and transform, and therefore gives more flexibility in the hyperparameter search.

PCA can be used as a dimensionality reduction technique if you drop principal components based on a heuristic, but it performs no feature selection: the principal components are retained in place of the original features. That said, tuning the number of retained components should work better than a heuristic, unless there are many low-variance components and you simply want to filter them out.
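Treating the number of retained components as a tunable hyperparameter, as suggested above, can be sketched like this (an illustrative example using scikit-learn and its breast cancer dataset, not part of the original answer):

```python
# Hypothetical sketch: tune the number of retained principal components
# inside a pipeline, rather than fixing it with a variance heuristic.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

pipe = make_pipeline(
    StandardScaler(),          # PCA directions depend on feature scale
    PCA(),                     # n_components left open for the grid search
    LogisticRegression(max_iter=5000),
)
grid = GridSearchCV(pipe, {"pca__n_components": [2, 5, 10, 20]}, cv=5)
grid.fit(X, y)

print("best n_components:", grid.best_params_["pca__n_components"])
print("CV accuracy:", grid.best_score_)
```

Note that the model sees the components, not the original features, so the fitted coefficients are no longer directly interpretable per feature.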

Whereas in lasso the "feature selection" is done for you, so there is less scope for hyperparameter optimization.

LASSO ($\ell_1$ regularization), on the other hand, can intrinsically perform feature selection, since the coefficients of predictors are shrunk toward zero, some of them exactly to zero. It still requires hyperparameter tuning, because there is a regularization coefficient that weights how severe the penalty on the loss function is.
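The effect of that regularization coefficient can be seen directly: the sketch below (illustrative, using scikit-learn's breast cancer dataset; recall `C` is the inverse of the penalty strength) fits the same L1-penalized logistic regression at two penalty levels and counts the nonzero coefficients.

```python
# Hypothetical sketch: stronger L1 regularization (smaller C, i.e. larger
# lambda) drives more coefficients exactly to zero.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

nonzero = {}
for C in (0.01, 1.0):
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    model.fit(X, y)
    nonzero[C] = int(np.sum(model.coef_ != 0))

print(nonzero)  # nonzero coefficient counts at strong vs. weak penalty
```

With the heavy penalty (`C=0.01`) far fewer features retain nonzero weight than with the light one (`C=1.0`), which is exactly the intrinsic selection the answer describes.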


As @MatthewDrury commented, ordinary PCA is agnostic to the target variable, while LASSO regression is not, since the penalty is part of a regression model. This is actually the most important difference.

Answered by Firebug on December 25, 2021
