Research

Piotr Juszczak
I am interested in a wide range of theoretical and practical issues in statistical pattern recognition and probabilistic models, especially one-class classification problems and active learning.

One-class classification

When only samples of one class are easily accessible in a classification problem, the problem is called a one-class classification problem. Many standard classifiers, such as back-propagation neural networks, fail on such data. Other techniques, such as k-means clustering or the nearest neighbor classifier, can be applied after some minor changes.
In one-class classification, one class of the data, called the target set, has to be distinguished from all other possible objects, called outliers. A description of the target set should be constructed such that objects not originating from the target set are not accepted by it. It is assumed that almost no examples of the outlier class are available.
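
As an illustration of how a nearest neighbor rule might be adapted to the one-class setting, the following is a minimal sketch and not the method used in this research. It accepts an object when its distance to the nearest target object is below a threshold, and it sets that threshold so that roughly a chosen fraction of the training targets would be rejected. The class name `NearestNeighborDescription`, the parameter `reject_fraction` and the Gaussian toy data are all assumptions made for the example.

```python
import math
import random


def nearest_distance(x, points):
    """Euclidean distance from x to its nearest neighbor in points."""
    return min(math.dist(x, p) for p in points)


class NearestNeighborDescription:
    """Hypothetical one-class classifier: accept x if it lies close to the target set."""

    def __init__(self, reject_fraction=0.1):
        # Fraction of training targets allowed to fall outside the description.
        self.reject_fraction = reject_fraction

    def fit(self, targets):
        self.targets = list(targets)
        # Leave-one-out nearest-neighbor distance of every training target.
        loo = sorted(
            nearest_distance(t, self.targets[:i] + self.targets[i + 1:])
            for i, t in enumerate(self.targets)
        )
        # Threshold at the (1 - reject_fraction) quantile of those distances.
        k = min(len(loo) - 1, int((1.0 - self.reject_fraction) * len(loo)))
        self.threshold = loo[k]
        return self

    def accepts(self, x):
        return nearest_distance(x, self.targets) <= self.threshold


# Toy target set (an assumption): 200 objects from a 2-D standard normal.
random.seed(0)
targets = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
model = NearestNeighborDescription(reject_fraction=0.1).fit(targets)

print(model.accepts(targets[0]))    # a training target
print(model.accepts((10.0, 10.0)))  # an object far away from all targets
```

Only target objects are used for training here; the single free parameter controls how tightly the description fits around them.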

In general, one-class classification is harder than conventional two-class classification. In conventional classification the decision boundary is supported from both sides by examples of each class. Because in one-class classification only one set of data is easily available, only one side of the boundary is covered. On the basis of one class alone it is hard to decide how tightly the boundary should fit around the data in each direction.
The absence of example outlier objects also makes it very hard to estimate the error that the classifier makes. The error of the first kind, the fraction of target objects that are classified as outliers, can be estimated on the training set. The error of the second kind, the fraction of outlier objects that are classified as target objects, can be estimated only by making an assumption about the distribution of the outliers in the evaluation set. As long as no example outlier objects are available, we assume that the outliers are uniformly distributed in the feature space. It follows directly that minimizing the chance of accepting an outlier object amounts to minimizing the volume covered by the one-class classifier in the feature space.
Using the uniform distribution for the outlier objects implicitly assumes that the objects are represented by 'good' features. This means that outlier objects will lie around the target class and not inside it. If there still turns out to be some overlap between the target objects and outlier objects, the representation of the objects should be changed so that the distinction becomes easier.
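
The two error estimates discussed above can be sketched in a few lines. This is an illustrative example under stated assumptions, not a definitive procedure: the description is taken to be a simple ball around the origin (a stand-in for any one-class classifier), the targets are toy Gaussian data, and the artificial outliers are drawn uniformly from the box [-4, 4]^2, whose bounds are chosen arbitrarily. The function names are hypothetical. Under the uniform assumption, the estimated error of the second kind approximates the volume of the accepted region divided by the volume of the box.

```python
import math
import random


def accepts(x, center, radius):
    """Hypothetical description: a ball of the given radius around center."""
    return math.dist(x, center) <= radius


def error_first_kind(targets, center, radius):
    """Fraction of target objects rejected by the description."""
    rejected = sum(not accepts(t, center, radius) for t in targets)
    return rejected / len(targets)


def error_second_kind(center, radius, low, high, n_samples, rng):
    """Fraction of uniformly drawn artificial outliers that are accepted."""
    dim = len(center)
    accepted = 0
    for _ in range(n_samples):
        z = [rng.uniform(low, high) for _ in range(dim)]
        accepted += accepts(z, center, radius)
    return accepted / n_samples


rng = random.Random(0)
# Toy target set (an assumption): 500 objects from a 2-D standard normal.
targets = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
center = (0.0, 0.0)

results = {}
for radius in (1.0, 2.0, 3.0):
    e1 = error_first_kind(targets, center, radius)
    e2 = error_second_kind(center, radius, -4.0, 4.0, 20000, rng)
    results[radius] = (e1, e2)
    print(f"radius={radius}: first kind={e1:.3f}, second kind={e2:.3f}")
```

Shrinking the radius reduces the accepted volume and with it the estimated error of the second kind, at the price of rejecting more target objects.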





