JE Griffin and PJ Brown
Bayesian adaptive lassos with non-convex penalization
Date: February 2007
Abstract: The lasso (Tibshirani, 1996) has sparked interest in the use of penalization of the log-likelihood for variable selection, as well as shrinkage. Recently, there have been attempts to propose penalty functions which improve upon the lasso's properties for variable selection and prediction, such as SCAD (Fan and Li, 2001) and the adaptive lasso (Zou, 2006). We adopt the Bayesian interpretation of the lasso as the MAP estimate of the regression coefficients, which have been given independent, double exponential prior distributions. Generalizing this prior provides a family of adaptive lasso penalty functions, which includes the quasi-Cauchy distribution (Johnstone and Silverman, 2005) as a special case. The properties of this approach are explored. We are particularly interested in the case of more variables than observations, which is of characteristic importance for data arising in chemometrics, genomics and proteomics, to name but three. Our methodology can give rise to multiple modes of the posterior distribution, and we show how this may occur even with the convex lasso. These multiple modes do no more than reflect the indeterminacy of the model. We give fast algorithms and suggest a strategy of using a set of perfectly fitting random starting values to explore different regions of the parameter space with substantial posterior support. Simulations show that our procedure provides significant improvements on a range of established procedures, and we provide an example from chemometrics.
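The Bayesian interpretation invoked in the abstract can be made concrete: the lasso estimate, argmin 0.5||y − Xβ||² + λ||β||₁, is exactly the posterior mode when each coefficient is given an independent double exponential (Laplace) prior with rate λ and the noise variance is fixed at one. The sketch below, a standard coordinate-descent solver with soft-thresholding (not the authors' own algorithm), illustrates this MAP estimate; the function name and the toy data are assumptions for illustration only.

```python
import numpy as np

def lasso_map_coord_descent(X, y, lam, n_iter=200):
    """Coordinate descent for argmin_b 0.5*||y - X b||^2 + lam*||b||_1.

    This minimizer is the MAP estimate of b under independent
    double exponential priors p(b_j) = (lam/2) * exp(-lam * |b_j|)
    with unit noise variance, since the negative log posterior is
    (up to a constant) exactly the lasso objective above.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)          # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r
            # soft-thresholding update: shrinks, and zeroes small effects
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b

# Hypothetical toy example: two strong signals among five predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([2.0, 0.0, 0.0, -1.5, 0.0])
b_hat = lasso_map_coord_descent(X, y, lam=5.0)
```

Note how the penalty both shrinks the large coefficients toward zero and sets the irrelevant ones exactly to zero, the shrinkage-plus-selection behaviour the abstract refers to.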
Keywords: Bayesian variable selection in regression, scale mixtures of normals, normal-exponential-gamma, adaptive lasso, penalized likelihood, non-convexity.