PIDGraph API adoption activities

Dear All,

I feel my team has certain knowledge and usage gaps when trying to use the PIDGraph API, and I would be interested to know if there are any adopters of the API in this forum.

Do you know if DataCite is promoting the PIDGraph API for user adoption? Aside from the occasional question in this forum (and roadmap updates), we have not seen much outreach for adoption anywhere. Are there any engaged early adopters who can provide insights about using the API, or any other outreach events/activities happening?

Thank you for your comments

Rupert H

Hello Rupert,

We are using the (FREYA) PIDGraph API in the TAPIR project. Here are some usage examples: GitHub - Project-TAPIR/pidgraph-notebooks: Jupyter notebooks with examples of querying different PID graphs and providers like OpenAlex, FREYA PID Graph, OpenAIRE, ORCID, ROR, Crossref. Developed at TIB as part of the BMBF funded project TAPIR.

Last year there were several talks about the PIDGraph, e.g. FAIRsFAIR Repository Support Series - The role of Repositories in enabling Persistent Identifier (PID) Graphs - Slides | Zenodo, which was also recorded: The role of Repositories in enabling Persistent Identifier (PID) Graphs - Webinar - YouTube

If you are looking for more examples of how to query the PIDGraph, DataCite has released several Jupyter notebooks with examples.

If you want to build your own queries, you can also use the API playground https://api.datacite.org/graphql where you find all entities and their metadata fields documented (see on the right side the tabs “schema” and “docs”).
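As a starting point for building queries outside the playground, here is a minimal Python sketch that posts a query to the same endpoint. It assumes the endpoint accepts JSON POST requests. The `organization`, `name` and `works { totalCount }` fields, the `ID!` variable type, and the helper names `build_payload` and `run_query` are illustrative assumptions that should be checked against the “schema” tab.

```python
import json
import urllib.request

# Assumption: the playground URL also accepts programmatic POST requests.
GRAPHQL_URL = "https://api.datacite.org/graphql"

# Assumption: field names and the ID! type should be confirmed in the schema docs.
ORG_QUERY = """
query ($id: ID!) {
  organization(id: $id) {
    id
    name
    works {
      totalCount
    }
  }
}
"""


def build_payload(query: str, variables: dict) -> bytes:
    """Encode a GraphQL query and its variables as a JSON request body."""
    return json.dumps({"query": query, "variables": variables}).encode("utf-8")


def run_query(query: str, variables: dict, url: str = GRAPHQL_URL) -> dict:
    """POST the query to the endpoint and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(query, variables),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_query(ORG_QUERY, {"id": ...})` with the ROR ID of an organization sends the request. Keeping the payload builder separate makes it easy to inspect the request body before anything goes over the network.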

Hope this helps to get you started!
-Sandra

Hi Rupert,

I think @KellyStathis can help here. As I understand, Kelly is working on improving adoption of DataCite services.

And for outreach materials you can probably get help from @paul.vierkant.

I hope this helps

Best regards

Simon Norris

Hi Rupert, Simon,

Thanks for tagging me in here! At present, DataCite’s main use case for our GraphQL API has been powering DataCite Commons. @sandram has provided a great summary of the resources available to help with using the DataCite GraphQL API, which cover a variety of other use cases. I would recommend exploring the Jupyter notebooks and using the API playground as a starting point. For the sake of completeness, I will also link our support documentation here, which includes additional examples: DataCite GraphQL API Guide

@Rupert_Hawtrey, is there a specific use case you have that we can provide support with?

Cheers,
Kelly

Thank you @raxik76460, @sandram

Hi @KellyStathis,

Over the last year, we studied the DataCite Commons code to build our integration, which has been very useful. We are also very familiar with the notebooks (and all the pre-pandemic outreach materials). Let me thank @sandram for linking their notebooks.

The primary use case for our integration is obtaining data related to organizations (DOIs, metrics, people/ORCIDs). We are very familiar with the docs and the notebook examples for that use case. However, I think we lack know-how in alternative ways to use the API, ways to get better performance (we just follow the approaches in Commons), etc. That’s why we are interested in finding out who else has adopted the API.

Can you share if other repositories/apps use the PIDGraph API for their integrations?

Are you running any adoption webinars/workshops/demos for the PIDGraph API?

I would also be interested to know if there are newer outreach materials about the API. Maybe this is a question for @paul.vierkant. The only recent outreach presentation I could find was from @aninkov (DH Toolbox: Accessing, analyzing, and visualizing research data using DataCite and Jupyter Notebooks - YouTube) but I guess this is a different project.

Thank you for your time

Hi Rupert,

Because most of our members’ integrations are better served by the DataCite REST API, we are not aware of many cases of repositories using our GraphQL API for integrations. The GraphQL API is well-suited to cases like getting data for organizations, which would otherwise require nested queries using the REST API.

For completeness, here are DataCite’s presentations on the PID Graph:

  • The DataCite PID graph (video)
  • Introduction to the GraphQL API (pre-release version) (video)
  • The role of Persistent Identifiers and the PID-Graph for NFDI (slides, video)

The Research Data Alliance Open Science Graphs for FAIR Data IG may also be of interest.

If you have any specific requests for new outreach/training materials, please let me know. You can also provide suggestions on the GraphQL API itself via our Product Roadmap (click “Submit Idea” in the top right): DataCite - Roadmap. You can also provide feedback on our GraphQL API Guide using the “Suggest Edits” button: DataCite GraphQL API Guide

Cheers,
Kelly

Maybe some hints regarding performance:

  • If you are querying the GraphQL API with pagination and expect a lot of results, set the ‘first’ parameter to its maximum (1000) to minimize the number of requests you send.
  • DataCite Commons connects different APIs “under the hood”. For example, if you query it for an organization’s information only, it makes an API call to ROR, which is quite fast. When you use the connection organization → people, on the other hand, it additionally queries Wikidata and the ORCID API, which is slower.
    As far as I know, there is no documentation of the connections/API calls, but the code is open source in GitHub - datacite/lupo: DataCite REST API.

==> The performance of the GraphQL API depends heavily on the different APIs it is querying.
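The pagination hint can be sketched roughly as follows. This is a minimal Python illustration that assumes cursor-based pagination with `first`/`after` arguments and a `pageInfo { endCursor hasNextPage }` block. The query shape and the `run_query` callable (any function that sends a query with variables and returns the decoded JSON response) are hypothetical and should be checked against the schema.

```python
from typing import Callable, Iterator, Optional

# Assumption: query shape and field names; verify them in the API playground.
WORKS_QUERY = """
query ($id: ID!, $first: Int, $after: String) {
  organization(id: $id) {
    works(first: $first, after: $after) {
      pageInfo { endCursor hasNextPage }
      nodes { id }
    }
  }
}
"""

MAX_PAGE_SIZE = 1000  # maximum value of `first`, per the hint above


def iter_works(
    run_query: Callable[[str, dict], dict],
    org_id: str,
    page_size: int = MAX_PAGE_SIZE,
) -> Iterator[dict]:
    """Yield every work node of an organization, fetching one page per request.

    `run_query` is a hypothetical helper: it takes a query string and a
    variables dict and returns the decoded JSON response as a dict.
    """
    cursor: Optional[str] = None
    while True:
        variables = {"id": org_id, "first": page_size, "after": cursor}
        response = run_query(WORKS_QUERY, variables)
        works = response["data"]["organization"]["works"]
        yield from works["nodes"]
        if not works["pageInfo"]["hasNextPage"]:
            break
        # Continue from where the previous page ended.
        cursor = works["pageInfo"]["endCursor"]
```

Requesting the largest page size keeps the number of round trips low. This matters most when each request also triggers calls to slower upstream services.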

Thanks, @sandram @KellyStathis,

@sandram, that is good to know. However, many of our requests involve requesting single resources with their connections. We suspect that the request performance is affected by the connections, but our queries do not differ much from the ones in the codebase of DataCite Commons.

@KellyStathis wrote: “Because most of our members’ integrations are better served by the DataCite REST API, we are not aware of many cases of repositories using our GraphQL API for integrations.”

Thank you, @KellyStathis, for the links. Considering that there are few adopters, are you planning any activities to improve adoption?

Thank you all for your support and time

Hi @Rupert_Hawtrey, thanks for following up and apologies for the delayed reply! Currently, GraphQL is primarily supporting DataCite Commons and we don’t have planned adoption activities. We are currently exploring how to improve performance—as @sandram notes, certain queries can be quite slow!—before focusing on adoption.

That said, we’re always interested in feedback on the API and our current documentation from early adopters! I’ve made a note that it would be helpful to expand our docs to state what underlying queries each call makes, for example to the ROR/Wikidata/ORCID APIs.

thanks :melting_face: :grinning: :grinning: :grinning: :grinning: :grinning: :grinning: