Why the FAIR principles are not enough: lost data and a case for FAIR+
Abstract
A plausible reading of the FAIR principles (Wilkinson et al. 2016) is that they were introduced with the intention of eliminating, or at least reducing, Dark Data (that is, data which is non-reusable). Arguably, they derive their worth from how successful they are in this endeavour (even if only in principle). In this paper, we highlight a class of data which we call Lost Data. Lost Data is data whose existence is no longer known to the relevant interested parties. Lost Data poses a problem for the FAIR principles in so far as it is demonstrable that Lost Data both i) complies with the standards set out in the FAIR principles, and ii) nonetheless qualifies as dark, because its existence is unknown to those parties and it is, as a result, non-reusable. If the FAIR principles are indeed rightly understood as an attempt to eliminate Dark Data, as we suggest, Lost Data thus poses a significant problem: it demonstrates that the FAIR principles fail in this regard, and thus undermines the motivation for them. Lost Data, then, highlights the need to augment the FAIR principles to plug this identifiable “gap” in which Lost Data sits. Along with highlighting this problem, we suggest several possible changes to the FAIR principles that would help to mitigate it.