Predicting with Bayesian models, and especially with BUGS, is easy: just set the response in the test set to NA. You then also need to specify initial values for the response; set those to NA for the training cases and to a reasonable value for the test cases.
BUGS will then sample from the posterior predictive distribution for the response values you set to NA. These distributions incorporate the uncertainty about the regression coefficients. You can take the median of the samples if you want point estimates, but the standard deviation of the samples is also quite informative.
Here is a rather minimal example:
model
{
  for (i in 1:N)
  {
    y[i] ~ dnorm(mu, 1)
  }
  mu ~ dunif(-1000, 1000)
}
#data
list(N=10, y = c(-1,0,1,-1,0.5,-0.5,2,-1.5, NA, NA))
#inits
list(mu = 0, y = c(NA,NA,NA,NA,NA,NA,NA,NA,0,0))
You can then get posterior predictive distributions for $y_9$ and $y_{10}$. This example contains no predictors, but the approach also works with them; note that you would not set the predictors to NA, they remain unchanged for the test cases.
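For this toy model you can check by hand what BUGS is doing: with the essentially flat prior and known precision 1, the posterior of mu is (approximately) normal with mean $\bar{y}$ and variance $1/n$, and a posterior predictive draw adds unit-variance observation noise on top of a mu draw. A minimal Monte Carlo sketch in Python/NumPy (the seed and number of draws are arbitrary choices, not part of the original example):

```python
import numpy as np

rng = np.random.default_rng(0)

# The 8 observed training responses from the example (y[9], y[10] are NA).
y_train = np.array([-1, 0, 1, -1, 0.5, -0.5, 2, -1.5])
n = len(y_train)

# With an (effectively) flat prior on mu and known precision 1, the
# posterior of mu is N(mean(y), 1/n); these draws mimic BUGS's mu samples.
mu_draws = rng.normal(y_train.mean(), np.sqrt(1 / n), size=100_000)

# A posterior predictive draw for a new y adds observation noise (sd 1)
# on top of each mu draw, so parameter uncertainty is carried through.
y_new = rng.normal(mu_draws, 1.0)

print(np.median(y_new))  # point estimate, close to mean(y_train)
print(y_new.std())       # close to sqrt(1 + 1/n), i.e. larger than 1
```

The predictive standard deviation exceeding the observation standard deviation of 1 is exactly the extra uncertainty about mu mentioned above.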
Edit after comment:
You can also do this differently and separate the test and training data in the model above. That would look like this:
model
{
  for (i in 1:N.train)
  {
    y.train[i] ~ dnorm(mu, 1)
  }
  for (i in 1:N.test)
  {
    y.test[i] ~ dnorm(mu, 1)
  }
  mu ~ dunif(-1000, 1000)
}
#data
list(N.train=8, N.test = 2, y.train = c(-1,0,1,-1,0.5,-0.5,2,-1.5))
#inits
list(mu = 0, y.test = c(0,0))
This might look somewhat easier, but note that you will also need to split any predictors in the model (my example has none); you might then end up with vectors like sex.train and sex.test. Personally I prefer the first way, because it is terser.
And yes, I think this is a reasonable starting point. While some kinds of overfitting will show up in a Bayesian model as very high standard deviations for the coefficients, you still impose a model structure that might not fit the data well, which can also lead to poor predictions. You should also consider, for example, a full cross-validation, where you repeat this step with different splits of the original data.
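The splitting step of such a cross-validation can be sketched as follows. This is only an illustration of how the folds are built (the response vector and fold count here are made up); the actual BUGS rerun per fold is left as a comment:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical full response vector; in each fold its held-out entries
# play the role of the NA responses in the BUGS example above.
y = np.array([-1, 0, 1, -1, 0.5, -0.5, 2, -1.5, 0.3, -0.2])
k = 5

# Shuffle the indices once, then split them into k disjoint folds.
folds = np.array_split(rng.permutation(len(y)), k)

for test_idx in folds:
    y_masked = y.astype(float).copy()
    y_masked[test_idx] = np.nan  # these become the NA responses for BUGS
    # ... pass y_masked (nan -> NA) as data, rerun the sampler, and
    # compare the posterior predictive draws against y[test_idx].
    print(test_idx, int(np.isnan(y_masked).sum()))
```

Each observation is held out exactly once, so every case gets an out-of-sample posterior predictive distribution.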
Best Answer
Logistic regression is not "semi-parametric"; it has only a parametric component. For a parametric model, the number of parameters is fixed and does not depend on the amount of training data, only on the model itself. This is true for logistic regression: if you have $n$ variables $X_1,\ldots,X_n$, you have $n+1$ parameters $w_0,\ldots,w_n$ defining the model, and this number does not increase or decrease with the number of training examples. Note that non-parametric models also have parameters, but their number is not fixed and depends on the number of training examples.
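You can check this directly, for instance with scikit-learn (assuming it is available; the data here is synthetic and only serves to fit the model): the fitted parameter vector has size $n+1$ no matter how many training examples you use.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def n_params(n_samples, n_features=3):
    # Synthetic data: labels depend on the first feature plus noise.
    X = rng.normal(size=(n_samples, n_features))
    y = (X[:, 0] + rng.normal(size=n_samples) > 0).astype(int)
    model = LogisticRegression().fit(X, y)
    # Weights w_1..w_n plus the intercept w_0.
    return model.coef_.size + model.intercept_.size

print(n_params(50), n_params(5000))  # 4 and 4: n + 1, regardless of sample size
```

Contrast this with, say, k-nearest neighbours, where the whole training set is effectively the parameter set and grows with the data.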