Hello everyone.
I have started using the Classification Learner app and have some questions. I will use MATLAB's ovarian cancer data set as an example to illustrate them.
1) Suppose the response class is missing for an observation (e.g. the response comes from histology, and histology was not performed for that observation, but the predictor data is available). Is it preferable to assign the observation to an extra class (e.g. 'unknown'), or is it better not to use the observation at all?
2) When PCA is enabled to reduce the dimensionality of the observations (in the ovarian cancer data set, PCA reduces the number of predictors from 4000 to 215 and uses 7 of the 215 features), can we find out which features (the columns of obs in the ovarian cancer data set) PCA has kept?
3) When a trained model is exported to make predictions for new data and PCA was used during training, what extra arguments do we need when calling newPredictions = myExportedModel.predictFcn(newData) so that the function knows PCA was used to train myExportedModel?
Many thanks in advance for your help!
Regards, Ioannis
Best Answer