Dear all,
Within NFDI4Chem and our repositories for chemistry data, we have been discussing the publisher field: where to put the repository name, and why and how to add a repository identifier to the DOI metadata in DataCite’s schema. Let’s go through all of this:
Publisher: Definition from DataCite Schema v4.4:
The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource. This property will be used to formulate the citation, so consider the prominence of the role.
On citation, the documentation gives this pattern:
Creator (PublicationYear): Title. Publisher. (resourceTypeGeneral). Identifier
So what should show up here is the name of the repository, as journals are mentioned in citations for articles. This is clear and precise. Example:
S. Herres-Pawlis, F. Bach, I. Bruno, S. Chalk, N. Jung, J. Liermann, L. McEwen, S. Neumann, C. Steinbeck, M. Razum, O. Koepler, Angew. Chem. Int. Ed. 2022, 61, e202203038. https://doi.org/10.1002/anie.202203038
…and a hypothetical dataset:
Jon Doe (2023): NMR Spectra of all Structures published in PubChem. nmrXiv. (Dataset)
https://doi.org/10.57992/nmrxiv.p1
The publisher field would provide the repository name.
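To make the role of the publisher field concrete, here is a minimal Python sketch that assembles a citation from the pattern quoted above. The function name format_citation and the flat dictionary layout are my own assumptions for illustration, not part of the DataCite schema or any DataCite tool.

```python
def format_citation(meta: dict) -> str:
    """Build a citation string following the pattern
    Creator (PublicationYear): Title. Publisher. (resourceTypeGeneral). Identifier
    """
    creators = "; ".join(meta["creators"])
    return (
        f"{creators} ({meta['publicationYear']}): {meta['title']}. "
        f"{meta['publisher']}. ({meta['resourceTypeGeneral']}). "
        f"{meta['identifier']}"
    )


# Hypothetical dataset from the example above; the repository name
# goes into the publisher field.
dataset = {
    "creators": ["Jon Doe"],
    "publicationYear": 2023,
    "title": "NMR Spectra of all Structures published in PubChem",
    "publisher": "nmrXiv",
    "resourceTypeGeneral": "Dataset",
    "identifier": "https://doi.org/10.57992/nmrxiv.p1",
}

citation = format_citation(dataset)
print(citation)
```

Whatever is stored in publisher ends up verbatim in the citation, which is why a repository name reads well there and a bare identifier would not.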
Repository identifier: Better than a name would be an identifier. Identifiers for repositories are provided by re3data.org, e.g. the identifier for RADAR4Chem is https://doi.org/10.17616/R31NJNAY. Where should this be added to the metadata?
The publisher field is certainly the wrong field (see the definition and examples above). Someone suggested that the repository identifier might be added as a contributor with the contributorType HostingInstitution. However, I would be hesitant to add a DOI there, as all other contributors have names. Alternatively, the repository identifier could be a relatedIdentifier → relatedIdentifierType: DOI → relationType: IsPublishedIn → https://doi.org/10.17616/R31NJNAY
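For comparison, the two options could look roughly like this when written as DataCite-style JSON metadata, sketched here as Python dictionaries. The property names follow the DataCite schema; attaching the re3data DOI to the contributor via nameIdentifiers and the scheme label "re3data" are assumptions on my side, not an established best practice.

```python
import json

# re3data identifier for RADAR4Chem, as given above
RE3DATA_DOI = "10.17616/R31NJNAY"

# Option A: the repository as a contributor of type HostingInstitution.
# The contributor keeps a human-readable name; putting the re3data DOI
# into nameIdentifiers (and the scheme label) is an assumption.
option_a = {
    "contributors": [
        {
            "name": "RADAR4Chem",
            "nameType": "Organizational",
            "contributorType": "HostingInstitution",
            "nameIdentifiers": [
                {
                    "nameIdentifier": "https://doi.org/" + RE3DATA_DOI,
                    "nameIdentifierScheme": "re3data",  # assumed label
                }
            ],
        }
    ]
}

# Option B: the repository as a related identifier with
# relationType IsPublishedIn.
option_b = {
    "relatedIdentifiers": [
        {
            "relatedIdentifier": RE3DATA_DOI,
            "relatedIdentifierType": "DOI",
            "relationType": "IsPublishedIn",
        }
    ]
}

print(json.dumps(option_a, indent=2))
print(json.dumps(option_b, indent=2))
```

Option B keeps the contributor list free of non-person, non-name entries, at the cost of treating the repository like any other related resource.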
Any opinions on that? Is it planned to add a best-practice example for repository identifiers to the next version of the DataCite schema?
The use case for such a repository identifier would be to search for all datasets published in a given repository. This is, unfortunately, not possible based on the DOI prefix, as the prefix is tied to the registrant, not to the repository. Following the example above, the registrant for DOIs in RADAR4Chem is FIZ Karlsruhe, but they also register DOIs for other repositories, e.g. RADAR4Culture or RADAR.
Consequently, “Find Related Works” on the commons.datacite.org page for RADAR shows datasets in RADAR, RADAR4Chem and RADAR4Culture, as all datasets in these repositories have DOIs with the same prefix.
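The difference between searching by prefix and searching by repository identifier can be sketched with a few made-up records. Everything below is hypothetical: the DOI prefix 10.1234, the dataset DOIs, and the second repository identifier are placeholders, and the helper published_in is my own, not a DataCite API call. It assumes the repository is recorded as a relatedIdentifier with relationType IsPublishedIn.

```python
def published_in(records, repository_doi):
    """Return the records that carry an IsPublishedIn link to repository_doi."""
    wanted = repository_doi.lower()  # DOIs are case-insensitive
    hits = []
    for rec in records:
        for rel in rec.get("relatedIdentifiers", []):
            if (
                rel.get("relationType") == "IsPublishedIn"
                and rel.get("relatedIdentifier", "").lower() == wanted
            ):
                hits.append(rec)
                break  # one matching link is enough for this record
    return hits


def repo_link(doi):
    """Helper: build an IsPublishedIn related identifier entry."""
    return {
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
        "relationType": "IsPublishedIn",
    }


# Made-up records that all share one registrant prefix (10.1234).
records = [
    {"doi": "10.1234/chem-1", "relatedIdentifiers": [repo_link("10.17616/R31NJNAY")]},
    {"doi": "10.1234/culture-1", "relatedIdentifiers": [repo_link("10.17616/HYPOTHETICAL")]},
    {"doi": "10.1234/chem-2", "relatedIdentifiers": [repo_link("10.17616/r31njnay")]},
]

# The prefix cannot tell the repositories apart ...
prefixes = {rec["doi"].split("/")[0] for rec in records}

# ... but the repository identifier can.
hits = published_in(records, "10.17616/R31NJNAY")
print(prefixes)
print([rec["doi"] for rec in hits])
```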
Best,
Tillmann