Manuele Bicego | TU Delft Repository

Detecting outliers from pairwise proximities

Proximity isolation forests

Journal article (2023) - Antonella Mensi (author), David M. J. Tax (author), Manuele Bicego (author)

Because outliers are very different from the rest of the data, it is natural to represent outliers by their distances to other objects. Furthermore, there are many scenarios in which only pairwise distances are known, and feature-based outlier detection methods cannot directly be ...

Also for k-means

More data does not imply better performance

Journal article (2023) - M. Loog (author), Jesse H. Krijthe (author), Manuele Bicego (author)

Arguably, a desirable feature of a learner is that its performance gets better with an increasing amount of training data, at least in expectation. This issue has received renewed attention in recent years and some curious and surprising findings have been reported on. In essence ...

An Alternative Exploitation of Isolation Forests for Outlier Detection

Conference paper (2021) - Antonella Mensi (author), Alessio Franzoni (author), David M. J. Tax (author), Manuele Bicego (author)

Isolation Forests are one of the most successful outlier detection techniques: they isolate outliers by performing random splits in each node. It has been recently shown that a trained Random Forest-based model can also be used to define and extract informative distance measures ...

A dissimilarity-based multiple instance learning approach for protein remote homology detection

Journal article (2019) - Antonella Mensi (author), Manuele Bicego (author), Pietro Lovato (author), M. Loog (author), David M. J. Tax (author)

We study the problem of Protein Remote Homology Detection, which assesses the functional similarity of two proteins. We approach this as a problem of binary multiple-instance learning (MIL) that aims to distinguish between homologous and non-homologous proteins. The particular MI ...