Many classification algorithms already do exactly what you're looking for, but often present their answers to users in the form of a binary (or n-way) judgement. For example, SVMLight is an implementation of the support vector machine classification algorithm; people commonly use it to make binary judgments on some data set. What happens under the hood, however, is that the algorithm assigns a signed decision value to each example: the sign gives the predicted class, and the magnitude (the distance to the separating hyperplane) reflects the confidence. These values are not probabilities and are not bounded, but they are exactly what you should use for ranking your data!
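If you work in Python rather than with SVMLight directly, scikit-learn's `SVC` exposes the same signed decision values through `decision_function`. A minimal sketch on made-up toy data (all names and numbers here are for illustration only):

```python
import numpy as np
from sklearn.svm import SVC

# Toy, well-separated 2-D data: class 0 near the origin, class 1 far away.
X = np.array([[0, 0], [0, 1], [1, 0], [4, 4], [4, 5], [5, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear")  # an SVMLight-style linear SVM
clf.fit(X, y)

# Signed decision values: positive -> class 1, negative -> class 0.
# The magnitude is the (unnormalized) confidence.
scores = clf.decision_function(X)

# Rank examples by confidence that they belong to class 1.
ranking = np.argsort(-scores)
```

Sorting by the raw decision values gives you the ranking directly; there is no need to convert them to probabilities first.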
Since you know that the unknown objects in your data set do not belong to the three classes (they are past face photos of unauthorized users), you can treat them as a separate, fourth class. Because that fourth class is less well defined than the first three and contains very heterogeneous objects, you should expect it to be scattered in a complex shape over your input feature space.
Therefore, you need a classifier that can model complex non-linear decision boundaries, and SVMs (with a non-linear kernel) can. Unfortunately, SVMs are natively binary classifiers and you have multiple classes. But do you? You could also see your problem as a two-step procedure:
- Decide if the object is unknown or known
- If it is known, decide which of the three it is
It is not guaranteed that this will work better, but it's worth a shot. Step 1 is binary, which suits SVMs well; the ensemble schemes that extend them to multiple classes (one-vs-rest, one-vs-one) come with their own problems. Step 2 is only necessary if different authorized users have different access rights or if you need to log who was there at what time. If you only need an access/no-access decision, you can skip step 2. For step 2 you could still try SVM ensembles or another multi-class method.
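The two steps above can be sketched as follows. Everything here is hypothetical: the clusters stand in for face features, and the kernels and labels are my own choices, not a recommendation:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical toy features: three authorized users (tight clusters)
# plus scattered examples from past unauthorized faces.
user1 = rng.normal([0, 0], 0.3, (20, 2))
user2 = rng.normal([5, 0], 0.3, (20, 2))
user3 = rng.normal([0, 5], 0.3, (20, 2))
unknown = np.vstack([rng.normal([8, 8], 0.4, (15, 2)),
                     rng.normal([-4, -4], 0.4, (15, 2))])

X = np.vstack([user1, user2, user3, unknown])
known = np.array([1] * 60 + [0] * 30)            # step 1 labels: known vs unknown
user = np.array([1] * 20 + [2] * 20 + [3] * 20)  # step 2 labels, known rows only

step1 = SVC(kernel="rbf").fit(X, known)             # known/unknown boundary
step2 = SVC(kernel="rbf").fit(X[known == 1], user)  # which user, given known

def decide(x):
    """Return 0 for 'deny access', otherwise the predicted user id."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    if step1.predict(x)[0] == 0:
        return 0
    return int(step2.predict(x)[0])
```

Note that step 2 is trained only on the known examples, so it never has to model the scattered fourth class at all.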
You should try a couple of algorithms (including those that have a two step procedure and those that don't) on the same cross validation folds and decide.
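The key detail is reusing the *same* folds for every candidate, so differences in scores come from the algorithms, not from the splits. A sketch using scikit-learn (iris is a stand-in for your face features; the candidate list is just an example):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in data; in your case this would be the face features and labels.
X, y = load_iris(return_X_y=True)

# One fixed fold assignment, shared by all candidates.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, clf in [("svm", SVC(kernel="rbf")),
                  ("knn", KNeighborsClassifier(n_neighbors=5))]:
    scores = cross_val_score(clf, X, y, cv=cv)  # same folds for each clf
    results[name] = scores.mean()
```

With a fixed `random_state`, `cv` yields identical train/test splits on every call, which is what makes the comparison fair.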
Your application scenario also can tell you whether it is a good idea or not to aggregate allowed users into one class:
- If you have only very few users that rarely if ever change, it is more practical to aggregate them into one class once and not have to retrain the classifier all the time
- If you have a long list of authorized users, with new ones frequently added and old ones removed, it would not be practical to put them all into one class and retrain the entire classifier every time a user changes (though for some techniques, like $k$NN, retraining is cheap enough that this is not prohibitive)
- You could do multiple single class classifications: user 1 or not? user 2 or not? etc. With 3 users that will work just fine. With 300 it would be dangerous because you inflate your chances of allowing an intruder. The user scans once and your system does 300 tests (without the user seeing there are so many tests). Each of these 300 tests would have a non-zero probability of falsely admitting someone who is not that particular user. So an intruder has 300 chances of being falsely admitted. If you raise all your 300 cutoffs in the 300 tests to counterbalance this problem, you raise the chances that authorized users will have to scan multiple times.
Regarding the higher prevalence of class 1, class imbalance can be problematic for most algorithms (more for some than for others), but at a ratio of 4 to 1 it shouldn't be a major concern yet.
What you should do, however, is take misclassification costs into consideration. If you know how much more costly it is to misclassify in one direction than in the other, and if you have an estimate of the class probabilities, you can base your decision on expected cost rather than on raw accuracy.
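Concretely, the expected-cost rule compares the two risks and picks the cheaper one. The cost numbers below are made-up placeholders; in an access-control setting you would choose them to reflect how much worse admitting an intruder is than rejecting an authorized user:

```python
COST_FALSE_ADMIT = 100.0  # assumed: admitting an intruder is very costly
COST_FALSE_REJECT = 1.0   # assumed: rejecting an authorized user is a nuisance

def decide(p_authorized):
    """Admit only if the expected cost of admitting is lower than rejecting.

    p_authorized is the estimated probability that the person at the
    door is an authorized user (e.g. from a probabilistic classifier).
    """
    expected_cost_admit = (1 - p_authorized) * COST_FALSE_ADMIT
    expected_cost_reject = p_authorized * COST_FALSE_REJECT
    return "admit" if expected_cost_admit < expected_cost_reject else "reject"
```

With these numbers the rule only admits when $p > 100/101 \approx 0.99$, i.e. the 100:1 cost ratio shifts the decision threshold far away from the naive 0.5.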
Best Answer
People tend to use DHOG (dense Histograms of Oriented Gradients) descriptors, which are closely related to SIFT and other image description methods, for this purpose. Another commonly used descriptor is SURF (Speeded-Up Robust Features).
You can take a look at VLFeat.
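To give an idea of what these descriptors compute, here is a heavily simplified numpy sketch of the core HOG idea: per-cell histograms of gradient orientations, weighted by gradient magnitude. Real HOG/DHOG implementations (e.g. in VLFeat) add block normalization and interpolation, which are omitted here:

```python
import numpy as np

def dense_orientation_histograms(img, cell=8, bins=9):
    """Simplified sketch of the HOG/DHOG idea (no block normalization,
    no interpolation): one orientation histogram per cell x cell patch."""
    gy, gx = np.gradient(img.astype(float))          # image gradients
    mag = np.hypot(gx, gy)                           # gradient magnitude
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation in [0, pi)
    h, w = img.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            m = mag[y:y + cell, x:x + cell].ravel()
            a = ang[y:y + cell, x:x + cell].ravel()
            # Magnitude-weighted histogram of orientations in this cell.
            hist, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# E.g. a 32x32 image with 8x8 cells gives 4*4 = 16 cells of 9 bins each.
img = np.random.default_rng(1).random((32, 32))
features = dense_orientation_histograms(img)
```

The resulting vector is what you would feed into the SVM discussed above; in practice, use a mature implementation (VLFeat, or `skimage.feature.hog`) rather than this sketch.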