Why Logistic Regression Sometimes Outperforms Neural Networks

Tags: gini · logistic · neural-networks · regression

I have 5 samples (each containing ~380K records, 33 predictive variables, and 1 binary Target):

  • one sample is used to train the models
  • the remaining 4 samples are used to validate the models
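
For illustration, one way such a split can be produced (a minimal sketch; full_data is a hypothetical name for the pooled data, not the actual pipeline):

import numpy as np

# Shuffle the pooled data, then cut it into five roughly equal samples.
shuffled = full_data.sample(frac=1, random_state=42).reset_index(drop=True)
parts = np.array_split(shuffled, 5)

train = parts[0]          # one sample to train the models
validations = parts[1:]   # the remaining four samples to validate them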

The following table compares the Gini coefficients of the logistic regression against those of the multilayer perceptron (MLP):

                      Logistic Regression    MLP
Train sample                 35.8            34.9
Validation sample 1          40.0            34.4
Validation sample 2          37.7            32.0
Validation sample 3          37.5            31.5
Validation sample 4          36.4            34.2

As you can see, the Gini coefficients of the logistic regression are consistently higher than those of the MLP.
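
For reference, the Gini here is the usual rescaling of the ROC AUC (Gini = 2·AUC − 1, reported on a 0–100 scale). A minimal sketch of the computation, assuming scikit-learn is available and y_val and p_val (assumed names) hold a validation sample's targets and predicted probabilities:

from sklearn.metrics import roc_auc_score

# Gini = 2 * AUC - 1, from predicted probabilities on a validation sample.
auc = roc_auc_score(y_val, p_val)
gini = 100 * (2 * auc - 1)   # on the 0-100 scale used in the table
print(round(gini, 1))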

Why could that be?

Before running both the logistic regression and the MLP, I categorized the categorical variables and also scaled the numeric variables.
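
For illustration, the preprocessing amounts to something like the following (a sketch; num_cols and cat_cols are assumed names, and integer coding is only one possible reading of "categorized"):

from sklearn.preprocessing import StandardScaler

# Scale the numeric variables.
data[num_cols] = StandardScaler().fit_transform(data[num_cols])

# One possible reading of "categorized": map each category to an integer code.
for c in cat_cols:
    data[c] = data[c].astype('category').cat.codes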

The code of the logistic regression is really simple and straightforward:

import statsmodels.api as sm

Y = data['Target']   # the binary target
X = data[col_list]   # the list of 33 predictive features

X1 = sm.add_constant(X)   # add the intercept column

logit = sm.Logit(Y, X1)
result = logit.fit()
print(result.summary())
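
Out-of-sample probabilities for the validation samples can then be obtained from the fitted model (a sketch; X_val is an assumed name for one validation sample's features):

# The constant must be added again so the columns match the training design matrix.
X_val1 = sm.add_constant(X_val)
p_val = result.predict(X_val1)   # predicted probabilities of Target = 1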

The code of the MLP is this one:

# Imports assumed; the standalone `keras` package works equivalently.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_model():
    model = Sequential()
    model.add(Dense(5, input_dim=33, activation='relu'))   # hidden layer 1
    model.add(Dense(5, activation='sigmoid'))              # hidden layer 2
    model.add(Dense(1, activation='sigmoid'))              # output: P(Target = 1)
    # Compile with binary cross-entropy loss and the Adam optimizer
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = build_model()
model.fit(X, Y, epochs=4, batch_size=30, verbose=1)   # X = predictive features; Y = target
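
The MLP's probabilities for a validation sample can be obtained analogously (a sketch; X_val is an assumed name):

# Probabilities from the sigmoid output unit, flattened to 1-D
# so they can feed the same Gini computation as above.
p_val = model.predict(X_val).ravel()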

I don't understand why the MLP underperforms the logistic regression.

Best Answer

If the response, conditioned on your predictors, roughly follows a logistic curve, then logistic regression will be superior. Despite the ML hype, deep learning and neural networks do not always outperform simpler models.
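
That is, logistic regression is exactly the right model when

$$P(Y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^\top x)}}$$

holds for the predictors $x$.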

Have you examined the output of the logistic model? What do the residuals look like? Note that statsmodels' Logit.fit() performs plain maximum likelihood with no penalty; scikit-learn's LogisticRegression, by contrast, applies an L2 penalty by default, which could well help given the number of predictors.
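
A quick way to check whether a penalty matters here (a sketch using scikit-learn, whose LogisticRegression is L2-penalized by default):

from sklearn.linear_model import LogisticRegression

# C is the inverse regularization strength; C=1.0 is the default.
clf = LogisticRegression(penalty='l2', C=1.0, max_iter=1000)
clf.fit(X, Y)
p_val = clf.predict_proba(X_val)[:, 1]   # probabilities for the Gini comparison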

I'm not sure what "categorized the categorical variables" means. If it's something like one-hot encoding, that is not recommended for regression; the model implementation should handle categorical variables automatically.