The interaction between classification and rejection performance for distance-based reject-option classifiers

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Consider the class of problems in which a target class is well-defined, and an outlier class is ill-defined. In these cases new outlier classes can appear, or the class-conditional distribution of the outlier class itself may be poorly sampled. A strategy to deal with this problem involves a two-stage classifier, in which one stage is designed to perform discrimination between known classes, and the other stage encloses known data to protect against changing conditions. The two stages are, however, interrelated, implying that optimising one may compromise the other. In this paper the relation between the two stages is studied within an ROC analysis framework. We show how the operating characteristics can be used for both model selection, and in aiding in the choice of the reject threshold. An analytic study on a controlled experiment is performed, followed by some experiments on real-world datasets with the distance-based reject-option classifier.

Keywords: Ill-defined classification problems; Unseen classes; Reject-option; Model selection; ROC analysis