TransWikia.com

Prove identity matrix with singular value decomposition

Mathematics Asked by Hojas on November 26, 2021

Let $I_d$ denote the $d \times d$ identity matrix. For $X \in \mathbb{R}^{m\times n}$ and $\lambda>0$, I have to prove the following:

$$(X^TX+\lambda I_n)^{-1}X^T=X^T(XX^T+\lambda I_m)^{-1}$$

I am thinking of using the singular value decomposition of $X$ because of the $X^TX$ term, so that:

$$X = U\Sigma V^T$$

But I don't know how to apply it correctly.

Thanks in advance.

One Answer

$\newcommand{\diag}{\operatorname{diag}}$You are on the right track using the SVD. Write $X = U\Sigma V^T$ with $U \in \mathbb{R}^{m\times m}$ and $V \in \mathbb{R}^{n\times n}$ orthogonal (so $U^TU = I_m$ and $V^TV = I_n$) and $\Sigma \in \mathbb{R}^{m\times n}$. The rest is some tedious algebra:
$$
\begin{align}
(X^TX+\lambda I_n)^{-1}X^T&=X^T(XX^T+\lambda I_m)^{-1}\\
\iff (V\Sigma^T U^T U\Sigma V^T + \lambda I_n)^{-1} V\Sigma^T U^T &= V\Sigma^T U^T (U\Sigma V^T V\Sigma^T U^T + \lambda I_m)^{-1}\\
\iff (V(\Sigma^T\Sigma + \lambda I_n)V^T)^{-1} V\Sigma^T U^T &= V\Sigma^T U^T (U(\Sigma\Sigma^T + \lambda I_m) U^T)^{-1}\\
\iff V(\Sigma^T\Sigma + \lambda I_n)^{-1}V^T V\Sigma^T U^T &= V\Sigma^T U^T U(\Sigma\Sigma^T + \lambda I_m)^{-1} U^T\\
\iff V(\Sigma^T\Sigma + \lambda I_n)^{-1}\Sigma^T U^T &= V\Sigma^T (\Sigma\Sigma^T + \lambda I_m)^{-1} U^T
\end{align}
$$
Since $U$ and $V$ are invertible, all that is left is to show that $(\Sigma^T\Sigma + \lambda I_n)^{-1}\Sigma^T=\Sigma^T (\Sigma\Sigma^T + \lambda I_m)^{-1}$. We verify this by computing both sides explicitly.
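As a numerical sanity check of the reduction above, here is a minimal NumPy sketch. The helper name `svd_full`, the sizes $m=3$, $n=5$, and the value $\lambda = 0.7$ are illustrative assumptions, not part of the question. It only confirms, for one random matrix, that the left-hand side of the identity equals its SVD form $V(\Sigma^T\Sigma + \lambda I_n)^{-1}\Sigma^T U^T$; it is not a proof.

```python
import numpy as np

def svd_full(X):
    """Return U (m x m), Sigma (m x n), V (n x n) with X = U @ Sigma @ V.T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=True)
    m, n = X.shape
    k = min(m, n)
    Sigma = np.zeros((m, n))
    Sigma[:k, :k] = np.diag(s)  # singular values on the main diagonal
    return U, Sigma, Vt.T

# Illustrative sizes and regularisation value (assumptions for this sketch).
rng = np.random.default_rng(0)
m, n, lam = 3, 5, 0.7
X = rng.standard_normal((m, n))
U, Sigma, V = svd_full(X)

# Left-hand side of the identity, computed directly from X.
lhs_full = np.linalg.inv(X.T @ X + lam * np.eye(n)) @ X.T
# The same quantity after substituting X = U Sigma V^T and cancelling U^T U, V^T V.
lhs_reduced = V @ np.linalg.inv(Sigma.T @ Sigma + lam * np.eye(n)) @ Sigma.T @ U.T

reduction_ok = np.allclose(lhs_full, lhs_reduced)
print(reduction_ok)
```

The comparison uses `np.allclose` rather than exact equality because the two expressions agree only up to floating-point rounding.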

Assume $m < n$ (the case $m \ge n$ is analogous). Then $\Sigma^T\Sigma = \operatorname{diag}(\sigma_1^2,\dots,\sigma_m^2, 0,0,\dots)$ and $\Sigma\Sigma^T = \operatorname{diag}(\sigma_1^2,\dots,\sigma_m^2)$. Using the fact that the inverse of a diagonal matrix is the diagonal matrix with its diagonal elements inverted, we have $$(\Sigma^T\Sigma + \lambda I_n)^{-1}= \operatorname{diag}^{-1}(\sigma_1^2+\lambda,\dots,\sigma_m^2+\lambda, \lambda,\lambda,\dots) = \operatorname{diag}((\sigma_1^2+\lambda)^{-1},\dots,(\sigma_m^2+\lambda)^{-1}, \lambda^{-1},\lambda^{-1},\dots).$$ Multiplying on the right by the $n\times m$ diagonal matrix $\Sigma^T$, we have $$(\Sigma^T\Sigma + \lambda I_n)^{-1}\Sigma^T= \operatorname{diag}_{n\times m}(\sigma_1(\sigma_1^2+\lambda)^{-1},\dots,\sigma_m(\sigma_m^2+\lambda)^{-1}).$$ Note that the extra $\lambda^{-1}$ entries are killed by the zero rows of $\Sigma^T$.

Similarly for the right-hand side: $$(\Sigma\Sigma^T + \lambda I_m)^{-1} = \operatorname{diag}^{-1}(\sigma_1^2+\lambda,\dots,\sigma_m^2+\lambda) = \operatorname{diag}((\sigma_1^2+\lambda)^{-1},\dots,(\sigma_m^2+\lambda)^{-1}).$$ Multiplying on the left by the $n \times m$ diagonal matrix $\Sigma^T$, we have: $$\Sigma^T(\Sigma\Sigma^T + \lambda I_m)^{-1} = \operatorname{diag}_{n\times m}(\sigma_1(\sigma_1^2+\lambda)^{-1},\dots,\sigma_m(\sigma_m^2+\lambda)^{-1}).$$ Both sides are the same matrix, which concludes the proof.
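The diagonal computation can also be checked numerically. The following is a small sketch, assuming NumPy; the function name `sigma_sides` and the sample singular values are made up for illustration. It builds $\Sigma$ from given singular values, evaluates both sides of the diagonal identity, and compares them with the closed form whose entries are $\sigma_i(\sigma_i^2+\lambda)^{-1}$. It also tries a tall $\Sigma$ ($m > n$) with a zero singular value, a case the answer does not work out explicitly.

```python
import numpy as np

def sigma_sides(s, m, n, lam):
    """Build the m x n matrix Sigma from singular values s and return both
    sides of the diagonal identity plus the closed-form n x m matrix."""
    k = min(m, n)
    Sigma = np.zeros((m, n))
    Sigma[:k, :k] = np.diag(s)
    # (Sigma^T Sigma + lam I_n)^{-1} Sigma^T
    left = np.linalg.inv(Sigma.T @ Sigma + lam * np.eye(n)) @ Sigma.T
    # Sigma^T (Sigma Sigma^T + lam I_m)^{-1}
    right = Sigma.T @ np.linalg.inv(Sigma @ Sigma.T + lam * np.eye(m))
    # Closed form: n x m "diagonal" matrix with entries sigma_i / (sigma_i^2 + lam)
    expected = np.zeros((n, m))
    expected[:k, :k] = np.diag(s / (s**2 + lam))
    return left, right, expected

lam = 0.5  # any lam > 0 keeps both inverses well defined
cases = [
    (2, 4, np.array([3.0, 1.0])),  # wide Sigma, m < n as in the answer
    (4, 2, np.array([2.0, 0.0])),  # tall Sigma with a zero singular value
]
results = []
for m, n, s in cases:
    left, right, expected = sigma_sides(s, m, n, lam)
    results.append(bool(np.allclose(left, right) and np.allclose(left, expected)))
print(results)
```

Because $\lambda > 0$, both matrices being inverted have strictly positive diagonal entries, so the check goes through even when some $\sigma_i = 0$.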

Answered by stochastic on November 26, 2021
