Predicting with Bayesian models, and especially with BUGS, is easy: just set the response in the test set to NA. You then also need to specify initial values for the response; set those to NA for the training cases and to a reasonable value for the test cases.
BUGS will then sample from the posterior predictive distribution for the response values you set to NA. These distributions incorporate the uncertainty about the regression coefficients. You can take the median of the samples if you want point estimates, but the standard deviation of the samples is also quite informative.
Here is a rather minimal example:
model
{
  for (i in 1:N)
  {
    y[i] ~ dnorm(mu, 1)
  }
  mu ~ dunif(-1000, 1000)
}
#data
list(N=10, y = c(-1,0,1,-1,0.5,-0.5,2,-1.5, NA, NA))
#inits
list(mu = 0, y = c(NA,NA,NA,NA,NA,NA,NA,NA,0,0))
You can then get posterior predictive distributions for $y_9$ and $y_{10}$. This example contains no predictors, but the approach also works with them; note that you would not set the predictors to NA, they remain unchanged for the test cases.
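For this toy model you can check by hand what BUGS is doing: with the essentially flat prior and known precision 1, the posterior of mu is (approximately) normal with mean $\bar{y}$ and variance $1/n$, and a posterior predictive draw adds unit-variance observation noise on top of a mu draw. A minimal Monte Carlo sketch in Python/NumPy (the seed and number of draws are arbitrary choices, not part of the original example):

```python
import numpy as np

rng = np.random.default_rng(0)

# The 8 observed training responses from the example (y[9], y[10] are NA).
y_train = np.array([-1, 0, 1, -1, 0.5, -0.5, 2, -1.5])
n = len(y_train)

# With an (effectively) flat prior on mu and known precision 1, the
# posterior of mu is N(mean(y), 1/n); these draws mimic BUGS's mu samples.
mu_draws = rng.normal(y_train.mean(), np.sqrt(1 / n), size=100_000)

# A posterior predictive draw for a new y adds observation noise (sd 1)
# on top of each mu draw, so parameter uncertainty is carried through.
y_new = rng.normal(mu_draws, 1.0)

print(np.median(y_new))  # point estimate, close to mean(y_train)
print(y_new.std())       # close to sqrt(1 + 1/n), i.e. larger than 1
```

The predictive standard deviation exceeding the observation standard deviation of 1 is exactly the extra uncertainty about mu mentioned above.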
Edit after comment:
You can also do this differently and separate the test and training data in the model above. That would look like this:
model
{
  for (i in 1:N.train)
  {
    y.train[i] ~ dnorm(mu, 1)
  }
  for (i in 1:N.test)
  {
    y.test[i] ~ dnorm(mu, 1)
  }
  mu ~ dunif(-1000, 1000)
}
#data
list(N.train=8, N.test = 2, y.train = c(-1,0,1,-1,0.5,-0.5,2,-1.5))
#inits
list(mu = 0, y.test = c(0,0))
This might look somewhat easier, but note that you will also need to split any predictors in the model (my example has none); you might then end up with vectors like sex.train and sex.test. Personally I prefer the first way, because it is terser.
And yes, I think this is a reasonable starting point. While some kinds of overfitting will show up in a Bayesian model as very high standard deviations for the coefficients, you still impose a model structure that might not fit the data well, which can also lead to poor predictions. You should also consider, for example, a full cross-validation, where you repeat this step with different splits of the original data.
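The splitting step of such a cross-validation can be sketched as follows. This is only an illustration of how the folds are built (the response vector and fold count here are made up); the actual BUGS rerun per fold is left as a comment:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical full response vector; in each fold its held-out entries
# play the role of the NA responses in the BUGS example above.
y = np.array([-1, 0, 1, -1, 0.5, -0.5, 2, -1.5, 0.3, -0.2])
k = 5

# Shuffle the indices once, then split them into k disjoint folds.
folds = np.array_split(rng.permutation(len(y)), k)

for test_idx in folds:
    y_masked = y.astype(float).copy()
    y_masked[test_idx] = np.nan  # these become the NA responses for BUGS
    # ... pass y_masked (nan -> NA) as data, rerun the sampler, and
    # compare the posterior predictive draws against y[test_idx].
    print(test_idx, int(np.isnan(y_masked).sum()))
```

Each observation is held out exactly once, so every case gets an out-of-sample posterior predictive distribution.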
Best Answer
Logistic regression is not "semi-parametric"; it has only a parametric component. For a parametric model, the number of parameters is fixed and does not depend on the amount of training data, only on the model itself. This is true for logistic regression: if you have $n$ variables $X_1,\ldots,X_n$, you have $n+1$ parameters $w_0,\ldots,w_n$ defining the model, and this number does not increase or decrease with the number of training examples. Note that non-parametric models also have parameters, but their number is not fixed and depends on the number of training examples.
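You can check this directly, for instance with scikit-learn (assuming it is available; the data here is synthetic and only serves to fit the model): the fitted parameter vector has size $n+1$ no matter how many training examples you use.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def n_params(n_samples, n_features=3):
    # Synthetic data: labels depend on the first feature plus noise.
    X = rng.normal(size=(n_samples, n_features))
    y = (X[:, 0] + rng.normal(size=n_samples) > 0).astype(int)
    model = LogisticRegression().fit(X, y)
    # Weights w_1..w_n plus the intercept w_0.
    return model.coef_.size + model.intercept_.size

print(n_params(50), n_params(5000))  # 4 and 4: n + 1, regardless of sample size
```

Contrast this with, say, k-nearest neighbours, where the whole training set is effectively the parameter set and grows with the data.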