Solved – Elastic net arbitrary alpha selection

cross-validation, elastic-net, feature-selection, machine-learning, small-sample

I'm trying to solve a prediction problem given the following constraints:

  1. I need an interpretable model to be used for experimental validation
  2. I need a model that performs feature selection to reduce ~20000 features to ~100
  3. I need the model to retain the correlated features and not simply select one of them arbitrarily
  4. I want the model to perform well with a small number of samples (worst case ~50)

The lasso performs feature selection, but the elastic net adds a ridge penalty that encourages correlated features to be selected together rather than one being chosen arbitrarily. So I believe it is the best model for this case.
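For reference, the elastic-net penalty in glmnet's parameterization is

$$\lambda \left[ \frac{1-\alpha}{2} \|\beta\|_2^2 + \alpha \|\beta\|_1 \right],$$

so $\alpha = 1$ is the lasso, $\alpha = 0$ is ridge, and intermediate $\alpha$ blends the two.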

Ideally I would select alpha by cross-validation nested inside the cross-validation used to select lambda. But constraint 4 (the small number of samples) makes such double cross-validation the bottleneck. This leads me to want to fix alpha = 0.5 arbitrarily rather than optimize it.
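For concreteness, below is a rough sketch of the double cross-validation I am trying to avoid, assuming x is a numeric predictor matrix and y a continuous response; the fold counts and the alpha grid are arbitrary placeholders, not values from my problem:

library(glmnet)

outer_folds <- sample(1:5, size = length(y), replace = TRUE)
alphas <- c(0.2, 0.5, 0.8)   # illustrative grid only

outer_mse <- sapply(1:5, function(k) {
  train <- outer_folds != k
  # inner CV picks lambda for each candidate alpha on the training folds
  fits <- lapply(alphas, function(a) cv.glmnet(x[train, ], y[train], alpha = a))
  best <- which.min(sapply(fits, function(f) min(f$cvm)))
  pred <- predict(fits[[best]], newx = x[!train, , drop = FALSE], s = "lambda.min")
  mean((y[!train] - pred)^2)   # error on the held-out outer fold
})
mean(outer_mse)   # honest performance estimate, but costly with ~50 samples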

Would this make sense? Is it justified? In other words, when this work is published, could it be criticized as an arbitrary choice? And if so, what are my options?

Best Answer

The answer to a similar question here advises following the glmnet vignette (assuming you're using R):

library(glmnet)

# Use the same fold assignments for every alpha so the CV errors are comparable
foldid <- sample(1:10, size = length(y), replace = TRUE)
cv1  <- cv.glmnet(x, y, foldid = foldid, alpha = 1)    # lasso
cv.5 <- cv.glmnet(x, y, foldid = foldid, alpha = 0.5)  # elastic net
cv0  <- cv.glmnet(x, y, foldid = foldid, alpha = 0)    # ridge

Keep foldid fixed and assess a grid of $\alpha$ values, cross-validating over $\lambda$ for each; since every $\alpha$ is evaluated on the same folds, their minimum CV errors are directly comparable.
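A minimal sketch of that grid search, reusing the foldid above; the 11-point $\alpha$ grid is an arbitrary choice, not prescribed by the vignette:

alphas <- seq(0, 1, by = 0.1)            # from ridge (0) to lasso (1)
cv_err <- sapply(alphas, function(a) {
  fit <- cv.glmnet(x, y, foldid = foldid, alpha = a)
  min(fit$cvm)                           # CV error at this alpha's best lambda
})
best_alpha <- alphas[which.min(cv_err)]

Because every fit shares the same foldid, differences in the minimum CV error reflect $\alpha$ rather than fold-assignment noise.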
