I am trying to use the generalized lasso function `genlasso {genlasso}` to incorporate an additional penalty matrix into the estimation process.

I started with a "hello world" example: **I used a diagonal matrix as the penalty matrix in a model**. I realized that the predictions I get from this model are very far from what I expected:

- the predictions are negative, whereas my response variable is positive.

A regular Lasso model from `glmnet` gives me at least reasonable (positive) predictions, so my question is: **what am I missing in the function call, or what do I not understand, such that I end up with such unintuitive predictions from the `genlasso {genlasso}` model?**

Reproducible example:

Step 1. Simulate data based on covariance and mean estimates from my real data (and correct the simulated response for negative values).

```
# Read objects describing simulation structure
sim.cov <- dget(url("https://raw.githubusercontent.com/martakarass/so_q/master/genlasso/sim_cov"))
sim.means <- dget(url("https://raw.githubusercontent.com/martakarass/so_q/master/genlasso/sim_means"))
# Simulate X (explanatory variables) and Y (response) matrices
# from multivariate normal
library(MASS)
set.seed(1)
sim.sample <- mvrnorm(n = 150, mu = sim.means, Sigma = sim.cov)
sim.sample.x <- sim.sample[, 1:66]
sim.sample.y <- sim.sample[, 67]
# Correct for negative response variable value
sim.sample.y[sim.sample.y < 0]
# [1] -0.8409024 -1.8470264 -1.8503388
sim.sample.y[sim.sample.y < 0] <- abs(sim.sample.y[sim.sample.y < 0])
summary(sim.sample.y)
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 0.01549 7.39000 10.93000 10.72000 13.58000 24.77000
```

Step 2. Simulate train and test subset indices.

```
train.idc <- as.logical(rbinom(n = nrow(sim.sample), size = 1, prob = 0.8))
test.idc <- !train.idc
```

Step 3. Build a regular Lasso model and predict values using the best lambda value from cross-validation.

```
# LASSO
# Cross-validate on a train set
library(glmnet)
cv.glmnet.out <- cv.glmnet(x = sim.sample.x[train.idc, ],
                           y = sim.sample.y[train.idc],
                           nfolds = 10, type.measure = "mse", alpha = 1)
lambda.cv <- cv.glmnet.out$lambda.min
model.lasso <- glmnet(sim.sample.x[train.idc, ], sim.sample.y[train.idc],
                      alpha = 1)
model.lasso.predicted <- predict(model.lasso, s = lambda.cv,
                                 newx = sim.sample.x[test.idc, ])
# Investigate predictions from a model
summary(as.vector(model.lasso.predicted))
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# 3.412 9.235 10.850 10.960 12.760 17.950
```

Step 4. Build a generalized Lasso model. For convenience, use the same best lambda as in the regular Lasso model.

```
# LASSO `genlasso`: diagonal penalty matrix
p <- ncol(sim.sample.x)
library(genlasso)
genlasso.diag.out <- genlasso(y = sim.sample.y[train.idc],
                              X = sim.sample.x[train.idc, ],
                              D = diag(1, p))
model.genlasso.diag.predicted <- predict(genlasso.diag.out, lambda = lambda.cv,
                                         Xnew = sim.sample.x[test.idc, ])$fit
```

Investigate the predicted values, which turn out to be negative:

```
summary(as.vector(model.genlasso.diag.predicted))
# Min. 1st Qu. Median Mean 3rd Qu. Max.
# -27.2800 -6.0070 1.2580 0.3049 6.2230 24.4700
```

## Best Answer

There are two issues:

`glmnet` includes an intercept by default while `genlasso` does not. To disable the intercept in `glmnet`, use `intercept = FALSE`; it will reduce the performance significantly. To instead add an intercept in `genlasso`, `cbind` a column of ones onto your `X` matrix, and change `D` accordingly. If you choose to penalize the intercept, you might need to set the corresponding value in the `p+1` by `p+1` `D` matrix more carefully than in the example code below. By setting `penalize.intercept` to `FALSE` in the code below, the intercept is not penalized: the `p` by `p+1` penalty matrix `D` is an identity matrix with the row corresponding to the intercept cut off.
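
For concreteness, here is a minimal sketch of that construction (the object names are illustrative, not taken from the original code):

```
# Intercept-augmented design matrix and a p x (p+1) penalty matrix that
# leaves the intercept unpenalized: an identity matrix with the row
# corresponding to the intercept removed
p     <- ncol(sim.sample.x)
x.int <- cbind(1, sim.sample.x[train.idc, ])  # column 1 is the intercept
D.int <- cbind(0, diag(p))                    # p rows, p+1 columns; no row penalizes column 1
```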

`glmnet` and `genlasso` are on different scales. Note that `glmnet` optimizes $\frac{1}{2N}\sum_{i=1}^N(y_i-\beta_0-x_i^T\beta)^2+\lambda\left[\frac{1-\alpha}{2}\left\|\beta\right\|_2^2+\alpha\left\|\beta\right\|_1\right]$, while `genlasso` optimizes $\frac12\left\|y-X\beta\right\|_2^2+\lambda\left\|D\beta\right\|_1$. Both use the $L^1$ norm for the lasso penalty, but `glmnet`'s error term is the squared $L^2$ norm divided by $2N$, while `genlasso`'s formulation divides by just $2$. The $\lambda$ values from one must therefore be rescaled before being applied to the other. (This is also an issue when using CV'd $\lambda$ values to fit a model with a significantly different $N$, which is ignored in the code below.) We can either scale `lambda.cv` to the appropriate value for `genlasso`, or use cross-validation to select an appropriate value directly; the code below takes the latter approach. Note that `glmnet` provides two methods of selecting the `lambda` value from cross-validation: the one giving the minimum CV loss (`lambda.min`), and the 1-standard-error rule, which adds some more regularization in exchange for slightly higher CV loss (`lambda.1se`). The 1se rule produces better out-of-sample performance in this case.
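
As a rough illustration of the first option (rescaling rather than re-cross-validating): multiplying `glmnet`'s objective by $N$ suggests that its `lambda` corresponds to roughly `N * lambda` on the `genlasso` scale when $\alpha = 1$ and `D` is an identity matrix. A sketch:

```
# Rough conversion of the glmnet lambda to the genlasso scale:
# glmnet divides the squared-error term by 2N, genlasso only by 2,
# so multiply by the number of training observations
n.train <- sum(train.idc)
lambda.genlasso.approx <- n.train * lambda.cv
```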

Code for `glmnet` without an intercept, and for `genlasso` with an intercept and a CV-selected `lambda`:
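
A sketch along those lines (the fold assignment, the `lambda` grid, the seed, and the object names below are assumptions for illustration, not taken from the original answer):

```
# glmnet without an intercept (for comparison; as noted above, this
# reduces performance significantly)
cv.noint <- cv.glmnet(x = sim.sample.x[train.idc, ], y = sim.sample.y[train.idc],
                      nfolds = 10, type.measure = "mse", alpha = 1,
                      intercept = FALSE)

# genlasso with an intercept column and an unpenalized-intercept D matrix
x.tr  <- cbind(1, sim.sample.x[train.idc, ])
y.tr  <- sim.sample.y[train.idc]
x.te  <- cbind(1, sim.sample.x[test.idc, ])
p     <- ncol(sim.sample.x)
D.int <- cbind(0, diag(p))   # p x (p+1): identity with the intercept row cut off

# Fit once on the full training set (also provides a path of lambda values)
fit.full <- genlasso(y = y.tr, X = x.tr, D = D.int)

# 10-fold cross-validation over a lambda grid contained in every fold's path
set.seed(2)   # assumed seed, only to make the folds reproducible
K <- 10
folds <- sample(rep(1:K, length.out = length(y.tr)))
fold.fits <- lapply(1:K, function(k)
  genlasso(y = y.tr[folds != k], X = x.tr[folds != k, ], D = D.int))
lam.max     <- min(sapply(fold.fits, function(f) max(f$lambda)))
lambda.grid <- seq(lam.max, 0.01, length.out = 50)

cv.mse <- sapply(1:K, function(k) {
  pred <- predict(fold.fits[[k]], lambda = lambda.grid,
                  Xnew = x.tr[folds == k, , drop = FALSE])$fit
  colMeans((pred - y.tr[folds == k])^2)
})
lambda.cv.gen <- lambda.grid[which.min(rowMeans(cv.mse))]

# Predict on the test set at the CV-selected lambda
pred.gen <- predict(fit.full, lambda = lambda.cv.gen, Xnew = x.te)$fit
summary(as.vector(pred.gen))
```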