Face clustering is a subfield of computer vision and pattern recognition with many applications, such as face recognition and surveillance. Accurate clustering of faces can also help create labeled datasets. However, face clustering has not been well studied in the domain of comics, so it is uncertain which feature extraction and clustering methods perform well on the faces of comic characters. In this paper, we investigate the effectiveness of comic face clustering. To conduct our investigation, we implement two pipelines: one that automatically extracts character faces from comic strips, and another that clusters the extracted faces. Using Dilbert Comics for our experiments, we examine the performance of various feature extraction and clustering methods. Additionally, we experiment with combining feature extraction methods and removing noisy samples to increase clustering accuracy. We show that color information is crucial for accurate clustering, and that combining color with shape features further improves accuracy. However, our experiments indicate that an accuracy improvement is not guaranteed for every combination of feature extraction methods. We also demonstrate that removing noisy samples using hierarchical clustering can increase clustering precision. Using our findings, we achieve an F1 score of 0.752 on our Dilbert Comics dataset of 77,768 face images. We obtain this result by clustering 20,988 non-noisy face images into 35 clusters with a precision of 0.886.
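
For readers unfamiliar with how precision and F1 can be computed for a clustering, the following is a minimal sketch of one common convention, the pairwise definition. This is an assumption for illustration only: the abstract does not state which definition of precision and F1 is used, and the function name `pairwise_prf` and the toy labels are hypothetical.

```python
from itertools import combinations


def pairwise_prf(true_labels, pred_labels):
    """Pairwise precision, recall, and F1 of a clustering.

    Assumed convention (not taken from the paper): a pair of samples is a
    true positive if both share a cluster and share a ground-truth character.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_pred and same_true:
            tp += 1  # correctly grouped together
        elif same_pred:
            fp += 1  # grouped together, but different characters
        elif same_true:
            fn += 1  # same character, but split across clusters

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example with hypothetical labels: five faces, three predicted clusters.
true_labels = ["dilbert", "dilbert", "dogbert", "dogbert", "wally"]
pred_labels = [0, 0, 0, 1, 2]
precision, recall, f1 = pairwise_prf(true_labels, pred_labels)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Under this convention, dropping noisy samples before clustering tends to raise precision, since fewer mismatched pairs end up in the same cluster, while faces that are left unclustered can lower recall. The double loop is quadratic in the number of samples, so a dataset of this size would call for a contingency-table formulation instead.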