A dissimilarity-based multiple instance learning approach for protein remote homology detection

Abstract

We study the problem of protein remote homology detection, which assesses the functional similarity of two proteins. We cast it as a binary multiple-instance learning (MIL) problem whose aim is to distinguish homologous from non-homologous proteins. The MIL approach we employ is based on the dissimilarity representation, within which we consider various schemes for combining N-gram representations. This approach allows us to cope with longer N-grams, which capture a richer biological context, and results in a versatile framework that offers competitive performance compared to the state of the art.
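
As a rough illustration of the general idea (not the implementation described in the paper), the sketch below treats a protein sequence as a bag of overlapping N-grams and describes it by its dissimilarities to a set of prototype sequences. Every concrete choice here is an assumption made for the example: the function names, the Hamming distance between N-grams, the mean-of-minimum bag dissimilarity, and the toy sequences are all hypothetical.

```python
from typing import List, Sequence


def ngrams(sequence: str, n: int) -> List[str]:
    """Return all overlapping N-grams of a sequence (the instances of its bag)."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length N-grams.

    Assumption: the paper may use a different instance-level distance.
    """
    return sum(x != y for x, y in zip(a, b))


def bag_dissimilarity(bag_a: Sequence[str], bag_b: Sequence[str]) -> float:
    """Average, over instances in bag_a, of the distance to the nearest instance in bag_b.

    Assumption: this mean-of-minimum rule is one common choice of bag
    dissimilarity; it is asymmetric and is only an example here.
    """
    if not bag_a or not bag_b:
        raise ValueError("both bags must contain at least one N-gram")
    return sum(min(hamming(a, b) for b in bag_b) for a in bag_a) / len(bag_a)


def dissimilarity_vector(sequence: str, prototypes: Sequence[str], n: int) -> List[float]:
    """Represent a sequence by its bag dissimilarity to each prototype sequence."""
    bag = ngrams(sequence, n)
    return [bag_dissimilarity(bag, ngrams(p, n)) for p in prototypes]


# Toy amino-acid strings, invented purely for illustration.
prototypes = ["MKTAYIAK", "GSHMLEDP"]
features = dissimilarity_vector("MKTAYLAK", prototypes, n=3)
```

The resulting fixed-length vector could then be passed to any standard binary classifier, which is what makes a dissimilarity representation convenient for bags of varying size.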
