Crowdsourcing is a popular technique to collect large amounts of human-generated labels, such as the relevance judgments used to create information retrieval (IR) evaluation collections. Previous research has shown that collecting high-quality labels from a crowdsourcing platform can be challenging. Existing quality assurance techniques focus on answer aggregation or on the use of gold questions, where ground-truth data allows checking the quality of the responses. In this paper, we present qualitative and quantitative results revealing how different crowd workers adopt different work strategies to complete relevance judgment tasks efficiently, and the consequent impact on quality. We delve into the techniques and tools that highly experienced crowd workers use to be more efficient in completing crowdsourcing micro-tasks. To this end, we use both qualitative results from worker interviews and surveys and the results of a data-driven study of behavioral log data (i.e., clicks, keystrokes, and keyboard shortcuts) collected from crowd workers performing relevance judgment tasks. Our results highlight the presence of frequently used shortcut patterns that can speed up task completion, thus increasing the hourly wage of efficient workers. We observe how crowd work experience results in different types of working strategies, productivity levels, and quality and diversity of the crowdsourced judgments.