Hi everyone,
I’m working with chemical substance identifiers as metadata for data deposits, in order to align the records more closely with the FAIR principles. I’d like to represent the identifiers using the Subject properties in DataCite (Subject, subjectScheme, schemeURI, valueURI, and classificationCode), insofar as they can include or enable linked information.
For identifiers that are computed (2.1.x, where rules or software generate the identifier from the chemical structure), I’m uncertain what to include for the schemeURI and valueURI. This applies to the following, which are preferred because they are generally unique identifiers for a substance:
IUPAC Name, InChI, InChIKey, and Canonical SMILES
Other identifiers (2.3 on the web page) are more straightforward because a valueURI can be included, for example PubChem ID and DSSTox Substance ID.
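To make the contrast concrete, here is a rough sketch in Python of how the two cases might look as DataCite XML subject elements. It is only an illustration under assumptions: the helper name make_subject, the subjectScheme labels, and both schemeURI values are my own placeholders, not anything recommended by DataCite or IUPAC. The registry identifier gets a resolvable valueURI, while the computed one is left without a valueURI, which is exactly the open question.

```python
import xml.etree.ElementTree as ET


def make_subject(value, scheme, scheme_uri=None, value_uri=None):
    """Build one DataCite-style <subject> element (hypothetical helper)."""
    attrs = {"subjectScheme": scheme}
    if scheme_uri:
        attrs["schemeURI"] = scheme_uri
    if value_uri:
        attrs["valueURI"] = value_uri
    element = ET.Element("subject", attrs)
    element.text = value
    return element


subjects = ET.Element("subjects")

# Registry identifier: the value resolves to a landing page, so valueURI is easy.
cid = "2244"  # PubChem CID for aspirin
subjects.append(
    make_subject(
        cid,
        "PubChem CID",  # assumed scheme label
        scheme_uri="https://pubchem.ncbi.nlm.nih.gov/",  # assumed schemeURI
        value_uri=f"https://pubchem.ncbi.nlm.nih.gov/compound/{cid}",
    )
)

# Computed identifier: there is no obvious page for the value itself,
# so valueURI is omitted here; the schemeURI is an assumption.
subjects.append(
    make_subject(
        "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",  # InChIKey for aspirin
        "InChIKey",  # assumed scheme label
        scheme_uri="https://www.inchi-trust.org/",  # assumed schemeURI
    )
)

xml_text = ET.tostring(subjects, encoding="unicode")
print(xml_text)
```

The optional arguments keep the attributes out of the XML entirely when no URI is known, rather than emitting empty strings.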
Any ideas on how to handle the computed identifiers?
Thanks!
I just ran across a suggestion for InChI and InChIKey in a presentation on Connecting Chemistry Through PIDs (see slide 18), so I’ll adopt that approach.
Hi Brian, yes indeed, the approach you identify is now the topic of an IUPAC working party (FAIRSpec). Its focus is in fact NMR spectroscopy, but it will include such identifiers in its recommendations. See, e.g., R. M. Hanson, D. Jeannerat, M. Archibald, I. Bruno, S. Chalk, A. N. Davies, R. J. Lancashire, J. Lang and H. S. Rzepa, IUPAC specification for the FAIR management of spectroscopic data in chemistry (IUPAC FAIRSpec) – guiding principles, Pure Appl. Chem., 2022, DOI: https://doi.org/10.1515/pac-2021-2009. We hope these recommendations will emerge within the next 12 months. So hang on in there.
Brian, for some background, I think the first instances of using chemical identifiers as subject terms were described in M. J. Harvey, A. McLean, H. S. Rzepa, A metadata-driven approach to data repository design, J. Cheminform., 2017, DOI: 10.1186/s13321-017-0190-6
but the realisation soon dawned that the process needed to be standardized. So efforts started to gather a group of chemists interested in doing this and an IUPAC working party emerged in 2020 to do so.
These uses of the subject term were introduced in 2016, but they must be considered provisional and will be subsumed by the IUPAC FAIRSpec recommendations. They were proofs of concept of what could be achieved using them.