A New Baseline for Feature Description on Multimodal Imaging of Paintings
Abstract
Multimodal imaging is used by conservators and scientists to study the composition of paintings. To aid the combined analysis of these digitisations, the images must first be aligned. Rather than proposing a new domain-specific descriptor, we explore and evaluate how existing feature descriptors from related fields can improve the performance of feature-based registration of painting digitisations. We benchmark these descriptors on pixel-precise, manually aligned digitisations of "Girl with a Pearl Earring" by Johannes Vermeer (c. 1665, Mauritshuis) and of "18th-Century Portrait of a Woman". As a baseline, we compare against the well-established classical SIFT descriptor. We consider two recent descriptors: the handcrafted multimodal MFD descriptor and the learned unimodal SuperPoint descriptor. Experiments show that SuperPoint increases description matching accuracy by 40% for modalities with few modality-specific artefacts. Further, performing craquelure segmentation and using the MFD descriptor significantly improves description matching accuracy for modalities with many modality-specific artefacts.
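The abstract does not spell out how description matching accuracy is computed. As a minimal sketch of one plausible reading, assume that descriptors are extracted at the same keypoint locations in two pixel-aligned modalities, so that row i of each descriptor array refers to the same point on the painting. A match then counts as correct when the nearest neighbour of descriptor i in the other modality is again index i. The function name `matching_accuracy`, the Euclidean distance, and the nearest-neighbour criterion are all assumptions made for illustration, not the authors' protocol.

```python
import numpy as np


def matching_accuracy(desc_a, desc_b):
    """Fraction of keypoints whose nearest neighbour in desc_b is the same keypoint.

    Assumption: desc_a[i] and desc_b[i] describe the same painting location,
    which is only possible because the two modalities are pixel-aligned.
    """
    desc_a = np.asarray(desc_a, dtype=np.float64)
    desc_b = np.asarray(desc_b, dtype=np.float64)
    if desc_a.ndim != 2 or desc_a.shape != desc_b.shape:
        raise ValueError("expected two (N, D) arrays of identical shape")

    # Pairwise squared Euclidean distances, shape (N, N).
    sq_dist = (
        (desc_a ** 2).sum(axis=1)[:, None]
        - 2.0 * desc_a @ desc_b.T
        + (desc_b ** 2).sum(axis=1)[None, :]
    )

    # Index of the closest descriptor in modality B for each keypoint in A.
    nearest = sq_dist.argmin(axis=1)

    # A match is correct when it points back to the same keypoint index.
    return float((nearest == np.arange(len(desc_a))).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for real descriptors (e.g. 128-dimensional vectors).
    modality_a = rng.normal(size=(50, 128))
    modality_b = modality_a + 0.01 * rng.normal(size=modality_a.shape)
    print(matching_accuracy(modality_a, modality_b))
```

Under this reading, a descriptor that is robust to modality-specific artefacts keeps corresponding descriptors close together across modalities and therefore scores higher. Real descriptors such as SIFT, MFD, or SuperPoint would replace the synthetic arrays used here.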