Variable Selection in Logistic Regression Model
-
Abstract
Variable selection is one of the most important problems in pattern recognition. For the linear regression model, many methods address this problem, such as the least absolute shrinkage and selection operator (LASSO) and its many improved variants, but few variable selection methods exist for generalized linear models. We study the variable selection problem in the logistic regression model. We propose a new variable selection method, the logistic elastic net, and prove that it has the grouping effect, meaning that strongly correlated predictors tend to be in or out of the model together. The logistic elastic net is particularly useful when the number of predictors (p) is much larger than the number of observations (n). By contrast, the LASSO is not a very satisfactory variable selection method when p is much larger than n. The advantages and effectiveness of the method are demonstrated on real leukemia data and in a simulation study.
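As a rough illustration of the grouping effect, the following minimal sketch assumes the usual elastic net form of the penalty, that is, an L1 term plus a squared L2 term added to the mean logistic loss. The function names, the penalty weights `lam1` and `lam2`, the proximal gradient solver, and the synthetic data are all illustrative assumptions; they are not the formulation, algorithm, or data used in this paper.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |z|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def fit_logistic_elastic_net(X, y, lam1, lam2, step=0.1, n_iter=2000):
    """Proximal gradient descent on (assumed objective)
        mean logistic loss + lam1 * ||beta||_1 + lam2 * ||beta||_2^2
    with no intercept and labels y in {0, 1}."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the smooth part: logistic loss plus the ridge term.
        grad = [2.0 * lam2 * b for b in beta]
        for xi, yi in zip(X, y):
            r = sigmoid(sum(b * v for b, v in zip(beta, xi))) - yi
            for j in range(p):
                grad[j] += r * xi[j] / n
        # The proximal step handles the non-smooth L1 term.
        beta = [soft_threshold(b - step * g, step * lam1)
                for b, g in zip(beta, grad)]
    return beta


# Synthetic data (hypothetical): predictors 0 and 1 are identical copies,
# predictor 2 is independent noise, and the label depends on predictor 0.
random.seed(0)
X, y = [], []
for _ in range(50):
    x1 = random.gauss(0.0, 1.0)
    noise = random.gauss(0.0, 1.0)
    X.append([x1, x1, random.gauss(0.0, 1.0)])
    y.append(1 if x1 + 0.5 * noise > 0 else 0)

beta = fit_logistic_elastic_net(X, y, lam1=0.01, lam2=0.05)
print(beta)
```

In this toy setting the two identical predictors receive the same coefficient, so they enter the model together. With a pure L1 penalty (`lam2 = 0`) the objective is no longer strictly convex, and the split of weight between the two copies is not uniquely determined.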
-
-