I have a logistic model fitted with the following R function:
glmfit<-glm(formula, data, family=binomial)
For this model, a cutoff of 0.2 rather than the commonly used 0.5 gives a good classification (confusion matrix).
I want to use the cv.glm function with the fitted model:
cv.glm(data, glmfit, cost, K)
Since the response in the fitted model is a binary variable, an appropriate cost function is the following (taken from the "Examples" section of ?cv.glm):
cost <- function(r, pi = 0) mean(abs(r-pi) > 0.5)
Given that my cutoff is 0.2, can I apply this standard cost function, or should I define a different one? If so, how?
Thank you very much in advance.
Best Answer
There were no answers to my post, but I think I found the answer. All credit goes to @Feng Mai, who wrote a post here: "What is the cost function in cv.glm in R's boot package?". Thanks to it, here is my answer to my own question:
For a cutoff value of 0.2, I think I could apply the following cost function:
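A minimal sketch of such a cost function, assuming the response is coded 0/1 and that an observation is classified as 1 when its predicted probability exceeds the cutoff. The name mycost comes from the text; the cutoff argument and its default are my own illustration, not part of cv.glm:

```r
# Misclassification rate at a custom cutoff (sketch, assumes a 0/1 response).
# r  : observed responses (0 or 1)
# pi : predicted probabilities from the fitted model
# cutoff is a hypothetical extra argument; cv.glm only passes r and pi,
# so the default of 0.2 is what gets used during cross-validation.
mycost <- function(r, pi = 0, cutoff = 0.2) {
  predicted <- as.numeric(pi > cutoff)  # classify as 1 above the cutoff
  mean(predicted != r)                  # share of misclassified observations
}

# Small check: the third and fourth observations are misclassified
mycost(c(1, 0, 1, 0), c(0.30, 0.10, 0.15, 0.25))  # 0.5
```

Note that simply replacing 0.5 by 0.2 in the standard function, i.e. mean(abs(r - pi) > 0.2), would not do the same thing: it would count an observation with r = 1 and pi = 0.5 as an error even though it is classified correctly at a cutoff of 0.2.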
I would then pass mycost as the cost argument of cv.glm, together with the fitted model. Hopefully this works. Am I right?
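For reference, here is a sketch of how the whole call might look. The data are simulated purely for illustration (dat, x, y, and the coefficients are made up), K = 10 is an arbitrary choice, and mycost is written in a compact form that assumes a 0/1 response and a cutoff of 0.2:

```r
library(boot)  # provides cv.glm

set.seed(1)

# Simulated data, for illustration only
n <- 200
dat <- data.frame(x = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-1.5 + dat$x))

# Logistic model, fitted as in the question
glmfit <- glm(y ~ x, data = dat, family = binomial)

# Misclassification rate when predicting 1 for probabilities above 0.2
mycost <- function(r, pi = 0) mean((pi > 0.2) != r)

# 10-fold cross-validation using the custom cost
cv <- cv.glm(dat, glmfit, cost = mycost, K = 10)

# delta[1] is the raw CV estimate of the cost, delta[2] the adjusted one
cv$delta
```

With this cost function, cv$delta estimates the misclassification rate at the 0.2 cutoff rather than at 0.5.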