PID Graph for Data Repositories with Data Partner Organizations (both have RORs)

Hello,

I’m new to the PID graph. My organization implemented DOIs and RORs recently (we do not have individuals who deposit data, just organizations). We are hoping to develop a query that would grab the metrics for our data partners with datasets in our repository, making it easier for them to identify how many datasets resulted from a tranche of funding. However, we can’t figure out how to include more than one ROR in a query. We have a ROR and some of our partner organizations have RORs (both are included in the DataCite metadata for the datasets), but we only seem to be able to query one, not both.

Does anyone have a query that I could riff on?

Hey, interesting! Just to be clear, are you trying to query your own repository, or are you trying to query DataCite? And is the following a fair example of what you want: “Give me all the works published by [my organization] where the funder is [funder organization]”? If you are trying to query DataCite, @kellystathis is the expert, but there are also some good sample queries at “Queries and filtering”.

Hi @CRidsdale, is your question about the DataCite GraphQL API? If you could share the queries you’ve tried so far, I can take a look!

Hi Amanda, no, the funder is not part of my question. It’s for my organization (data repository/publisher) and our partner organizations (owners/authors) who host data with us.

Kelly, I’ll try to find them. I was working right in the PID Graph UI and thought the queries would stay there, but it looks like I lost them.

@CRidsdale No problem! They might be under “History” (top left)?

I’m also happy to try out some queries myself. To check that I understand what you’re looking for: Are you looking to query for all datasets published by your organization (as in, DOIs that are part of your DataCite repository) with a specific org’s ROR ID associated (e.g., as the creator’s nameIdentifier)?

@KellyStathis yes, that’s exactly what I’m trying to do! Sorry, I haven’t had the chance to dig into the query again, it’s been a very busy week so far. I’ll try to dedicate some time this afternoon or tomorrow.

@CRidsdale No problem! I had a chance to look at this today - there are a few different ways to approach the queries.

Using the DataCite REST API, you can explicitly look for a given ROR ID in different fields where ROR IDs apply.

For example, this query: https://api.datacite.org/dois?client-id=oncs.onc&query=creators.nameIdentifiers.nameIdentifier:"https://ror.org/02kkvpp62"%20OR%20contributors.nameIdentifiers.nameIdentifier:"https://ror.org/02kkvpp62"%20OR%20creators.affiliation.affiliationIdentifier:"https://ror.org/02kkvpp62"%20OR%20contributors.affiliation.affiliationIdentifier:"https://ror.org/02kkvpp62"

  • Retrieves DOIs from ONC repository (client-id=oncs.onc)
  • Queries for DOIs that have the ROR ID https://ror.org/02kkvpp62 (Technical University of Munich), using this logic:
creators.nameIdentifiers.nameIdentifier:"https://ror.org/02kkvpp62" OR 
contributors.nameIdentifiers.nameIdentifier:"https://ror.org/02kkvpp62" OR 
creators.affiliation.affiliationIdentifier:"https://ror.org/02kkvpp62" OR 
contributors.affiliation.affiliationIdentifier:"https://ror.org/02kkvpp62"
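
If you would rather build this URL in a script than by hand, here is a minimal Python sketch, assuming the same endpoint and the same four field names as in the example above. The helper names (`build_ror_query`, `build_dois_url`) are made up for illustration and are not part of any DataCite client library. The code only constructs the URL; it does not send a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the example query above.
DOIS_ENDPOINT = "https://api.datacite.org/dois"

# The four metadata fields searched in the example above.
ROR_FIELDS = [
    "creators.nameIdentifiers.nameIdentifier",
    "contributors.nameIdentifiers.nameIdentifier",
    "creators.affiliation.affiliationIdentifier",
    "contributors.affiliation.affiliationIdentifier",
]


def build_ror_query(ror_id: str) -> str:
    """Return an OR-joined query string matching ror_id in any of ROR_FIELDS."""
    return " OR ".join(f'{field}:"{ror_id}"' for field in ROR_FIELDS)


def build_dois_url(client_id: str, ror_id: str) -> str:
    """Return the REST API URL for one repository and one ROR ID (hypothetical helper)."""
    params = {"client-id": client_id, "query": build_ror_query(ror_id)}
    # urlencode percent-encodes the quotes and colons and turns spaces into "+".
    return f"{DOIS_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_dois_url("oncs.onc", "https://ror.org/02kkvpp62"))
```

Calling `build_dois_url` once per partner ROR ID gives one URL per partner organization, which may be a convenient starting point for per-partner counts.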

This query-based approach can essentially be replicated using the DataCite GraphQL API. There isn’t a built-in filter on works for an organization identifier, but you can use a query string instead.

  • The "first" parameter limits the output to the first 25 results. You can change this value.
  • For the query: I’m just looking at the creator nameIdentifier because I see that’s where most of your ROR IDs are, but you can extend this to include contributors, creator affiliations, and contributor affiliations as in the REST API example above.
{ 
  repository(id: "ONCS.ONC") {
    name
    clientId
    works (first: 25 query: "creators.nameIdentifiers.nameIdentifier:\"https://ror.org/02kkvpp62\"") {
      totalCount
      nodes {
        doi
        titles {
          title
        }
      }
    }
  }
}
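
If you want to run this outside the PID Graph UI, here is a rough Python sketch using only the standard library. It assumes the GraphQL API accepts a JSON POST at `https://api.datacite.org/graphql`; please check that against the DataCite documentation. The function names are hypothetical, and `run_query` is shown for completeness but is not called here.

```python
import json
import urllib.request

# Assumed GraphQL endpoint; verify against the DataCite documentation.
GRAPHQL_ENDPOINT = "https://api.datacite.org/graphql"


def build_repository_query(repository_id: str, ror_id: str, first: int = 25) -> str:
    """Build the repository query shown above for one ROR ID (hypothetical helper)."""
    search = f'creators.nameIdentifiers.nameIdentifier:"{ror_id}"'
    # json.dumps produces a double-quoted literal with escaped inner quotes,
    # which is also a valid GraphQL string for these plain ASCII values.
    return (
        "{ repository(id: %s) { name clientId "
        "works(first: %d, query: %s) { totalCount nodes { doi titles { title } } } } }"
        % (json.dumps(repository_id), first, json.dumps(search))
    )


def run_query(query: str) -> dict:
    """POST the query as JSON and return the decoded response body."""
    body = json.dumps({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        GRAPHQL_ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    print(build_repository_query("ONCS.ONC", "https://ror.org/02kkvpp62"))
```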

Another way to approach this in GraphQL is by using the organization type. Above, we get results from a repository and filter by organization. But you can also go in the opposite direction: get results for a given organization, then filter by repository:

{ 
  organization (id: "https://ror.org/02kkvpp62") {
    works (first: 25 repositoryId: "oncs.onc") {
      totalCount
      nodes {
        doi
        titles {
          title
        }
      }
    } 
  }
}
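
Since the goal is counts for several partner organizations, one option is to use GraphQL aliases so that a single request asks about more than one ROR ID. The Python sketch below only builds such a query string. It reuses the `organization` and `works` fields and arguments from the example above, the `org0`, `org1`, ... alias names and the function name are made up for illustration, and I have not checked how the API handles many aliased organizations in one request.

```python
import json


def build_partner_counts_query(repository_id: str, ror_ids: list[str]) -> str:
    """Build one query with an aliased organization block per ROR ID (hypothetical helper)."""
    parts = []
    for index, ror_id in enumerate(ror_ids):
        # Each alias (org0, org1, ...) lets the same field appear more than once.
        parts.append(
            "org%d: organization(id: %s) "
            "{ works(first: 25, repositoryId: %s) { totalCount } }"
            % (index, json.dumps(ror_id), json.dumps(repository_id))
        )
    return "{ " + " ".join(parts) + " }"


if __name__ == "__main__":
    # The second ROR ID is a placeholder, not a real identifier.
    print(
        build_partner_counts_query(
            "oncs.onc",
            ["https://ror.org/02kkvpp62", "https://ror.org/EXAMPLE"],
        )
    )
```

Each alias would then come back as its own key in the response, so the per-partner `totalCount` values can be read side by side.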

Any of the above approaches should produce equivalent results. Let me know if you have any questions!