Does anyone have experience querying OpenAIRE by PID, using either the SPARQL endpoint or the HTTP API?
For example, using a GRID ID to get an organization and its linked projects?
Or using an ORCID iD to get publications and their related projects or funding?
I would appreciate some help on how to query the OpenAIRE research graph for these connections!
I’ve used the search API to search for datasets by ORCIDs. I haven’t used it to link to projects or funding, but the data was in the resulting JSON under oaf:result[‘rels’]. You can find my code for this here:
This code takes a relatively long time to run depending on the size of the ORCID list. It sends out one request per ORCID and there is a default sleep time between requests of 2 seconds in order to respect rate limiting.
As my use case did not involve large result sets, I didn’t bother implementing paging. I just set the number of results per page relatively high. If you expect a large number of results, I recommend lowering the page size and implementing paging.
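For anyone who does need paging, here is a minimal sketch of one request per ORCID iD with a pause between requests and a simple page loop. It is not the linked code. The base URL, the `page` and `size` parameter names, and the helper names (`build_url`, `fetch_all_pages`, `harvest`) are assumptions for illustration, and `fetch_page` is a hypothetical callable you supply that performs the HTTP GET and returns the list of records found on that page.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, List
from urllib.parse import urlencode

# Assumption: illustrative endpoint; replace with the API you actually query.
BASE_URL = "https://api.openaire.eu/search/datasets"


def build_url(orcid: str, page: int, size: int, base_url: str = BASE_URL) -> str:
    """Build one paged query URL for a single ORCID iD."""
    # Assumption: the API accepts `page` and `size` for paging.
    params = {"orcid": orcid, "format": "json", "page": page, "size": size}
    return f"{base_url}?{urlencode(params)}"


def fetch_all_pages(
    orcid: str,
    fetch_page: Callable[[str], List[dict]],
    size: int = 50,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield records page by page until a page is shorter than `size`."""
    page = 1
    while True:
        records = fetch_page(build_url(orcid, page, size))
        yield from records
        if len(records) < size:
            break  # a short (or empty) page means there is nothing left
        page += 1
        sleep(delay)  # pause before requesting the next page


def harvest(
    orcids: Iterable[str],
    fetch_page: Callable[[str], List[dict]],
    size: int = 50,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[dict]]:
    """Collect all records per ORCID iD, pausing between ORCID iDs too."""
    results: Dict[str, List[dict]] = {}
    for index, orcid in enumerate(orcids):
        if index:
            sleep(delay)
        results[orcid] = list(fetch_all_pages(orcid, fetch_page, size, delay, sleep))
    return results
```

Passing `fetch_page` and `sleep` in as arguments keeps the JSON parsing out of the loop and makes it easy to try the logic without hitting the API.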
Also bear in mind that the API I used (https://services.openaire.eu/search/v2/api/resources2) does not seem to be documented on the OpenAIRE developer site. I would be hesitant to use it in integrations without further inquiring about the stability of the API.
There is undoubtedly a better way to do the same thing in SPARQL, but as I’m not overly familiar with it or the OpenAIRE ontologies, this worked best for me. Maybe someone more knowledgeable will post a SPARQL solution.
Thank you for the code snippet. It seems I get the same results for an ORCID iD as via the OpenAIRE EXPLORE service, which is what I’m looking for!
In the meantime I also tried the HTTP API, which seems to accept an “orcid” parameter as well, but the results always come back empty. The parameter is not documented in OpenAIRE API documentation - Selective access, but if you send a query with an unknown parameter, such as http://api.openaire.eu/search/datasets?format=json&hello, “orcid” is listed in the response.
I tried using the API to query organizations as well and found some organizations with a ROR ID, which I haven’t seen in other APIs or even in the data dumps. Here, however, the results seem to differ from the EXPLORE service. When I query for organizations with https://services.openaire.eu/search/v2/api/resources2/?format=json&type=organizations I get a list of organizations, but when I search for them by name in the EXPLORE service, some are not included, e.g. “Tanger Computersystems”.
Unfortunately, I haven’t heard back from OpenAIRE (I sent a mail and a support request), so I can’t explain the differences.
The EXPLORE form seems to add some additional filters. If I interpret them correctly, there must either be a link to a project, or the related datasources must have a certain relateddatasourcecompatibilityid. This does not seem to be the case for Tanger Computersystems, which seems to have no related projects or outputs:
Some years ago I wrote a small script to get the funder and project information for a given DOI. You can find the source here.
The integration of ORCID seems to be quite recent, according to the blog: https://www.openaire.eu/openaire-integrates-orcid-wizard-in-explore-service.
The same seems to apply to curated organizations; check the Index and Stats update table at aggregation-and-content-provision-workflows (openaire.eu), entry 2021-07-14.
It’s been a while, but I recently documented my findings about querying the OpenAIRE HTTP API using PIDs and wanted to share them with you. Also, thanks a lot to you, Maarten and Andreas: both your programs were really helpful in exploring the data and the data model further!
OpenAIRE HTTP API: OpenAIRE API documentation - Selective Access
Metadata schema and available entities
Metadata schema: Schema documentation for oaf-1.0.xsd
While the data model describes multiple connected entities, the HTTP API only offers endpoints for research products and projects:
→ Connections between entities are made via the “rels” (=relations) or “context” property of an entity
PIDs in the HTTP API
The HTTP API describes the following identifiers that can be used to query research products or projects:
- callID - call identifier
- openaireParticipantID → this is an organization identifier, e.g. "openorgs____::8e609ea7e0fe26c86a3dea31a2ef2ce2" for TIB
→ OpenOrgs is the mapping tool for organizational IDs developed by OpenAIRE
→ OpenOrgs is not yet released, so queries using an organizational identifier such as a ROR ID might become possible at a later stage
Supported PID-queries in the HTTP API
We can query the following two connections from OpenAIRE using an ORCID iD or a DOI as input:
Person-Publication (ORCID → DOI)
- OpenAIRE lets you query different research products (publication, data, software, other) via an ORCID iD
- Each research product may be identified by its DOI
- Example: https://api.openaire.eu/search/publications?orcid=0000-0003-2499-7741&format=json
- Note: the API has separate endpoints for the different research product types (publications, research data, software and other research products). To get a full picture of a person’s output, you have to query each of these endpoints with the ORCID iD and take the union of the results
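A minimal sketch of querying every research-product endpoint for one ORCID iD and merging the results. The `publications` and `datasets` endpoint names appear in this thread; `software` and `other` are assumed to follow the same pattern. The helper names are hypothetical, and extracting records and their DOIs from the JSON response is left to the caller.

```python
from typing import Callable, Dict, Hashable, Iterable, List
from urllib.parse import urlencode

API_ROOT = "https://api.openaire.eu/search"
# Assumption: "software" and "other" mirror the publications/datasets endpoints.
ENDPOINTS = ("publications", "datasets", "software", "other")


def endpoint_urls(orcid: str) -> Dict[str, str]:
    """Return one query URL per research-product endpoint for an ORCID iD."""
    query = urlencode({"orcid": orcid, "format": "json"})
    return {name: f"{API_ROOT}/{name}?{query}" for name in ENDPOINTS}


def union_by_key(
    result_sets: Iterable[List[dict]],
    key: Callable[[dict], Hashable],
) -> List[dict]:
    """Merge several result lists, keeping the first record seen per key."""
    merged: Dict[Hashable, dict] = {}
    for records in result_sets:
        for record in records:
            merged.setdefault(key(record), record)
    return list(merged.values())
```

With records reduced to plain dicts, `union_by_key(lists, key=lambda r: r["doi"])` would deduplicate by DOI. Records without a DOI need a different key, for example the OpenAIRE identifier.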
Publication-Project (DOI → OpenAIRE project ID)
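A minimal sketch of the DOI side, assuming the endpoint accepts a `doi` query parameter in the same way it accepts `orcid`. Project links are reported earlier in this thread to sit under the "rels" property of a result. Because the exact nesting of the JSON response is not described here, the hypothetical `collect_key` helper walks the whole parsed response instead of relying on a fixed path.

```python
from typing import Any, List
from urllib.parse import urlencode

API_ROOT = "https://api.openaire.eu/search"


def doi_query_url(doi: str, endpoint: str = "publications") -> str:
    """Build a query URL for one DOI (assumes a `doi` parameter exists)."""
    return f"{API_ROOT}/{endpoint}?{urlencode({'doi': doi, 'format': 'json'})}"


def collect_key(node: Any, key: str = "rels") -> List[Any]:
    """Recursively gather every value stored under `key` in parsed JSON."""
    found: List[Any] = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)  # keep the value as-is, do not descend
            else:
                found.extend(collect_key(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_key(item, key))
    return found
```

Each collected "rels" value still has to be inspected by hand to pick out the project relations and their OpenAIRE project IDs.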
Both queries are also available and documented in the pidgraph-notebooks, our collection of Jupyter notebooks with examples of querying different PID providers like ORCID, ROR, Crossref and PID graphs like the FREYA PID Graph, OpenAlex and OpenAIRE for connected objects: