Digital technologies are transforming the ways in which natural science collections are managed and used. Digitization initiatives around the world are creating digital representations of the physical specimens held in collections. These representations are becoming increasingly semantically meaningful (and thus machine-actionable), and increasingly act as the mutable place for curating all data (first- and third-party) derived from and relating to the physical specimen. The notion of universal and stable persistent identifiers (PIDs) for these ‘digital specimens’ is central to museums’ ambitions for widening access, and to the proposed notions of Extended Specimens (Webster et al., 2017; Lendemer et al., 2019) and Next Generation Collections (Schindel and Cook, 2018). PIDs act as a digital doorway that allows us to do more than just find specimens. A wide variety of novel first- and third-party services become possible, for example: harmonizing the arrangement of loans and visits, finding specimens related to one another (think: ‘frequently bought together’, ‘customers who viewed this also viewed these’), linking to third-party information, and supporting Access and Benefit Sharing. Such services in the natural sciences can be compared to those enabled by Digital Object Identifiers (DOIs) and offered by Crossref, such as Cited-by, Metadata API, and Event Data; or, in the example of the Entertainment Identifier Registry (EIDR), to the more than 25 applications that identify and track content production and distribution in the film and TV entertainment supply chain, from studio to theatre, TV or mobile device.
To avoid fragmentation along national and/or regional lines, the global natural science collections community urgently needs action towards a common global scheme for persistent, unambiguous and actionable identification of digital specimens and collections. We propose a ‘Natural Sciences Identifier’ (NSId) scheme based on the Handle system, in a joint international governance arrangement under the Alliance for Biodiversity Knowledge (https://www.allianceforbio.org/).
Such a mechanism must persistently identify digital objects on the timescales typical of natural science collections, i.e., from decades to centuries. To achieve that, the mechanism must be independent of, and resistant to changes in, specific implementation technologies. It is instructive to compare with International Standard Book Numbers (ISBN), which have been in use since the mid-1960s to identify each edition and variation of published books. In such a form, the NSId represents everything a digital/extended specimen stands for, rendering each one unambiguously findable, accessible, and reusable for future science, commercial, policy and societal purposes, and establishing a trusted ‘brand’ over the very long term.
Reliable identifiers derive from robust services supporting persistence (minting, resolution) and machine actionability (semantics) under formal governance arrangements, which Alliance for Biodiversity Knowledge stakeholders are well placed to provide. Reliable identifiers enhance the value of collections and specimens. Identifiers that uniquely and meaningfully identify specimens and collections enhance the quality and accuracy of work. They confer authority, raising overall trust throughout the value chain (figure 1) founded in the worldwide collections of physical specimens and the digital assets arising out of digitization initiatives.
Figure 1: Value chain founded in natural science collections
At every point in the chain, reliable identifiers make it possible to unambiguously identify, refer to, use, trace and track natural science objects whose digital representations, derived third-party data and associated transactions are stored and manipulated in computer and information systems. Multiple value-adding service opportunities arise at all points along the chain. These can respond well to the emergence of a new Natural Sciences Identifier (NSId) scheme that is unconfounded by existing schemes, whose object characteristics are quite different.
A jointly governed, global, Handle-based system, layered over existing institutional identification practices and allowing a global PID to be created at the earliest moment, is the first step towards these ambitions. Additionally, it would help to overcome limitations linked to the functionality of URIs (cf. RFC 3650). Our requirements and use cases, however, are different to those of the DOI-based and other identifier schemes. Alongside guaranteeing the association (link) between the physical specimen and its digital representation, these include (figure 2): persistence for the very long term, governance by the stakeholders themselves, a brand that inspires trust and authority, and scalability (circa 30 billion identifiers) with pertinent tailored services.
Figure 2: Governance, trust, scalability and persistence enabled by the scheme reinforce each other
The non-profit DONA Foundation (www.dona.net), based in Geneva, Switzerland, is the neutral forum for global governance of the entire Handle system: a rapid-resolution, globally distributed system, run by multiple groups, that the public can use for resolving identifiers (Handles). DOIs make up a significant proportion of these Handles, and we propose that Natural Sciences Identifiers sit under the same system. Such a scheme would, we believe, have the backing of DONA.
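To make the resolution step concrete, the following is a minimal sketch in Python, not a proposed implementation. It assumes only the general Handle syntax (a naming-authority prefix and a local suffix separated by the first ‘/’) and the public proxy at hdl.handle.net. The function names are hypothetical, and because no NSId prefix has been assigned, the example uses a DOI from the reference list, which is itself a Handle.

```python
from typing import Tuple
from urllib.parse import quote

# Public Handle proxy; an NSId scheme might use its own resolver (assumption).
HANDLE_PROXY = "https://hdl.handle.net"


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a Handle into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a valid Handle: {handle!r}")
    return prefix, suffix


def resolver_url(handle: str) -> str:
    """Build the proxy URL that would redirect to the identified object."""
    prefix, suffix = split_handle(handle)
    # Percent-encode the suffix, but keep any further '/' characters in it.
    return f"{HANDLE_PROXY}/{prefix}/{quote(suffix, safe='/')}"


if __name__ == "__main__":
    # A DOI is a Handle whose prefix starts with '10.'.
    print(resolver_url("10.1162/dint_a_00034"))
```

The identifier string itself carries no location information; only the resolver knows where the object currently lives, which is what allows the identifier to outlast any particular hosting technology.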
The time is right to establish a worldwide joint declaration of intent, and to proceed with plans to initiate a global persistent identification scheme for natural science digital object types such as digital specimens and collections.
Alex Hardisty, Dimitris Koureas, Wouter Addink
Distributed System of Scientific Collections (DiSSCo)
16th December 2019.
The Distributed System of Scientific Collections (DiSSCo) is a new world-class Research Infrastructure (RI) for natural science collections. The DiSSCo RI works for the digital unification of all European natural science assets under common curation and access policies and practices. These aim to make the data easily Findable, more Accessible, Interoperable and Reusable (FAIR).
References and further reading:
L. Lannom, D. Koureas & A.R. Hardisty. FAIR data and services in biodiversity science and geoscience. doi: 10.1162/dint_a_00034.
Lendemer et al., 2019. doi: 10.1093/biosci/biz140.
Schindel and Cook, 2018. doi: 10.1371/journal.pbio.2006125.
Webster et al., 2017. ISBN: 978-1-4987-2915-4.