
Validation of my implementation for trace norm and L1 regularization

Asked by rando on Cross Validated, February 13, 2021

I wrote code for trace-norm and L1 regularization:

$$\underset{A}{\operatorname{argmin}} \; \|AX - B\|_F^2 + \lambda\, g(A), \qquad g(A) \in \{\|A\|_1,\; \|A\|_*\},$$

where $\|A\|_*$ is the trace norm of $A$ and $\|A\|_1$ is the entrywise L1 norm. I followed the iterative soft-thresholding algorithm (ISTA) for the L1 regularization and this paper for the trace-norm regularization (I implemented the APG algorithm on page 12). I have two questions, regarding L1 and the trace norm respectively:

  1. I noticed in the L1 implementation that the gradient keeps going up and down; the console output below shows what I mean (G is the Frobenius norm of the gradient; a convergence-monitor sketch follows the output). Is that normal? I believe this happens because the algorithm reaches a minimum and after that the gradient oscillates around it? An important observation is that when $\lambda = 0$ the gradient always increases!
G =  0.8187491879972199
L1 Gradient is increasing at iteration  754
G =  0.8187491879972197
L1 Gradient is increasing at iteration  757
G =  0.8187491879972197
L1 Gradient is increasing at iteration  762
G =  0.8187491879972197
L1 Gradient is increasing at iteration  778
G =  0.8187491879972197
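
For reference, here is a minimal NumPy sketch of the ISTA iteration I am describing, together with the proximal-gradient residual, which (unlike the raw gradient norm G of the smooth term) should go to zero at a minimizer of the composite objective; the step size, $\lambda$, and iteration count below are placeholders:

```python
import numpy as np

def soft_threshold(Z, tau):
    """Entrywise soft-thresholding: the prox operator of tau * ||.||_1."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def ista_l1(X, B, lam, n_iter=1000):
    """ISTA sketch for min_A ||A X - B||_F^2 + lam * ||A||_1."""
    A = np.zeros((B.shape[0], X.shape[0]))
    # step size = 1 / Lipschitz constant of the gradient of ||A X - B||_F^2
    t = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        G = 2.0 * (A @ X - B) @ X.T               # gradient of the smooth term
        A_new = soft_threshold(A - t * G, t * lam)
        # Prox-gradient residual: goes to 0 at a minimizer, even though
        # ||G||_F does not (there, G balances the L1 subgradient).
        residual = np.linalg.norm(A_new - A, 'fro') / t
        A = A_new
    return A, residual
```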

  2. The same observation holds for the trace norm; however, I was expecting more robust behavior, since there is a stopping criterion (which I did not implement for the L1 regularization). That stopping criterion is defined as follows (the details are on pages 16 and 17):

[image: the stopping criterion from the paper, pages 16 and 17]

The authors say that Tol should be "moderately small", but I don't know what values are reasonable; is 1e-4 sensible, for instance? Anyway, although I used that stopping criterion, I still see the same increase and decrease of the gradient, as shown below (a sketch of the proximal step and stopping test follows the output). Is that normal?

Gradient is increasing at iteration  82
G =  0.8346248363494398
Gradient is increasing at iteration  83
G =  0.83489958527063
Gradient is increasing at iteration  84
G =  0.8350311313694062
Gradient is increasing at iteration  105
G =  0.8251464433484178
Gradient is increasing at iteration  106
G =  0.8252529405067198
Gradient is increasing at iteration  107
G =  0.8254498017879796
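
For the trace norm, the proximal step is singular-value thresholding. Below is a minimal sketch of a plain proximal-gradient loop with a generic relative-change stopping test; the paper's APG adds momentum on top of this step, and its exact Tol-based criterion (in the image above) may differ from the test here, so `tol = 1e-4` is only a placeholder:

```python
import numpy as np

def svt(Z, tau):
    """Singular-value thresholding: the prox operator of tau * ||.||_* ."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def prox_grad_trace(X, B, lam, tol=1e-4, n_iter=1000):
    """Sketch for min_A ||A X - B||_F^2 + lam * ||A||_*  (no momentum)."""
    A = np.zeros((B.shape[0], X.shape[0]))
    t = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)   # 1 / Lipschitz constant
    for _ in range(n_iter):
        G = 2.0 * (A @ X - B) @ X.T
        A_new = svt(A - t * G, t * lam)
        # Generic relative-change test; the paper's Tol criterion may differ.
        if np.linalg.norm(A_new - A, 'fro') / max(1.0, np.linalg.norm(A_new, 'fro')) <= tol:
            return A_new
        A = A_new
    return A
```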
