Solve Linear Least Squares with Squared $L_2$ Norm Regularization (Tikhonov / Ridge Regression) with a Non-Negativity Constraint Using FASTA

convex optimization, least squares, optimization

The open-source optimization library FASTA (Fast Adaptive Shrinkage/Thresholding Algorithm) is an efficient, easy-to-use implementation of the proximal gradient method for regularized optimization problems. Its bundled solvers handle LASSO, Ridge Regression, and Matrix Completion problems.

I want to solve the following problem using FASTA:
$$\arg\min_x \left\{ \|Ax - b\|_2^2 + \lambda \|x\|_2^2 \right\} \quad \text{s.t.} \quad x \ge 0$$
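In FASTA's canonical form $\min_x f(Ax) + g(x)$, and with the conventional $\tfrac{1}{2}$ scaling used in the code below, this splits as

$$f(z) = \frac{1}{2}\|z - b\|_2^2, \qquad g(x) = \frac{\mu}{2}\|x\|_2^2 + \mathbb{1}_{x \ge 0}(x),$$

where $\mu$ plays the role of $\lambda$ and $\mathbb{1}_{x \ge 0}$ is the indicator function of the non-negative orthant.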

Based on the following LASSO solver (in MATLAB):

%%  Define ingredients for FASTA
%  Note: LASSO solver solves min f(Ax)+g(x). 
%  A and b are given.
%  f(z) = .5 ||z - b||^2
f    = @(z) .5*norm(z-b,'fro')^2;
grad = @(z) z-b;
% g(z) = 0 for all feasible z, and infinity otherwise.  However, iterations
% always end with a feasible z, so we need not consider the infinity case.
g = @(x) 0;
% proxg(z,t) = argmin_x g(x) + .5||x-z||^2, i.e. the projection of z onto
% the L1 ball of radius mu
prox = @(z,t) project_1ball(z,mu);

%% Call solver
[solution, outs] = fasta(A,At,f,grad,g,prox,x0,opts);

I wrote the following non-negativity constrained Ridge Regression solver:

%%  Define ingredients for FASTA
%  Note: fasta solves min f(Ax)+g(x).
%  f(z) = .5 ||z - b||^2
f    = @(z) .5*norm(z-b,'fro')^2;
grad = @(z) z-b;

g = @(x) 0.5*mu*norm(x,'fro')^2;   % ridge penalty term
prox = @(x,t) mapping(x,t*mu);     % intended proximal step for g plus x >= 0

%% Call solver
[solution, outs] = fasta(A,At,f,grad,g,prox,x0,opts);

%% Embedded function
function x = mapping(x,tau)
    x = x + tau;   % shift by tau
    x(x < 0) = 0;  % clip negatives for the non-negativity constraint
end

But it did not converge.
How to solve the NON-NEGATIVE constrained ridge regression problem using FASTA?

Best Answer

The FASTA package augments the basic forward-backward (proximal gradient) step with heuristics for acceleration, momentum, and adaptive line search.

Their paper, A Field Guide to Forward-Backward Splitting with a FASTA Implementation, describes the method and is the recommended background reading.
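Before comparing methods, it helps to pin down why the posted solver stalls: the prox step is wrong. For $g(x) = \frac{\mu}{2}\|x\|_2^2 + \mathbb{1}_{x \ge 0}(x)$, the proximal operator is

$$\operatorname{prox}_{t g}(z) = \arg\min_{x \ge 0} \left\{ \frac{t \mu}{2}\|x\|_2^2 + \frac{1}{2}\|x - z\|_2^2 \right\} = \frac{\max(z, 0)}{1 + t \mu},$$

a componentwise scaling followed by projection onto the non-negative orthant, not the shift-and-clip implemented in mapping.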

I implemented four methods for your problem to benchmark different approaches (a sketch of the vanilla variant follows the list):

  • Vanilla Projected Gradient Descent.
  • Projected Gradient Descent with Momentum.
  • Projected Gradient Descent Accelerated (Nesterov).
  • Projected Gradient Descent Accelerated (FISTA).
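As a rough illustration, here is a minimal sketch of the vanilla Projected Gradient Descent variant; this is a sketch under stated assumptions, not the benchmarked code from the repository, and numIterations is a hypothetical iteration budget:

% Vanilla Projected Gradient Descent for
% min .5*||A*x - b||^2 + .5*mu*||x||^2  s.t.  x >= 0
stepSize = 1 / (norm(A)^2 + mu);  % 1/L, where L = Lipschitz constant of the gradient
x = zeros(size(A, 2), 1);         % feasible starting point

for ii = 1:numIterations
    gradSmooth = A' * (A * x - b) + mu * x;  % gradient of the smooth objective
    x = max(x - stepSize * gradSmooth, 0);   % gradient step, then project onto x >= 0
end

The 1/L step size guarantees convergence for this smooth-plus-projection splitting; the momentum, Nesterov, and FISTA variants differ only in how they extrapolate between iterates.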

These are the results I got, compared against a reference solution computed by CVX:

(Figure: convergence of the four methods against the CVX reference solution.)

The MATLAB code is available at my StackExchange Mathematics Q3872982 GitHub Repository.
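For completeness, here is my sketch of corrected FASTA ingredients for the original formulation, reusing the call signature and variable names from the question (A, At, b, mu, x0, opts); treat it as a sketch of the fix rather than the repository code:

%%  Define ingredients for FASTA
%  f(z) = .5 ||z - b||^2
f    = @(z) .5*norm(z - b, 'fro')^2;
grad = @(z) z - b;
% g(x) = .5*mu*||x||^2 plus the indicator of x >= 0; iterates stay
% feasible, so the indicator term never contributes to the value.
g    = @(x) 0.5*mu*norm(x, 'fro')^2;
% prox_{t*g}(z) = max(z, 0) / (1 + t*mu): scale, then project onto x >= 0
prox = @(z, t) max(z, 0) ./ (1 + t*mu);

%% Call solver
[solution, outs] = fasta(A, At, f, grad, g, prox, x0, opts);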
